IMMUNISATION AGAINST CHLAMYDIA PNEUMONIAE
All documents cited herein are incorporated by reference in their entirety. 

TECHNICAL FIELD 

This invention is in the field of immunisation against chlamydial infection, in particular against 
infection by Chlamydia pneumoniae. 

BACKGROUND ART 

Chlamydiae are obligate intracellular parasites of eukaryotic cells which are responsible for endemic 
sexually transmitted infections and various other disease syndromes. They occupy an exclusive 
eubacterial phylogenic branch, having no close relationship to any other known organisms - they are 
classified in their own order (Chlamydiales) which contains a single family (Chlamydiaceae) which 
in turn contains a single genus (Chlamydia). A particular characteristic of the Chlamydiae is their
unique life cycle, in which the bacterium alternates between two morphologically distinct forms: an 
extracellular infective form (elementary bodies, EB) and an intracellular non-infective form 
(reticulate bodies, RB). The life cycle is completed with the re-organization of RB into EB, which 
subsequently leave the disrupted host cell ready to infect further cells. 

Four chlamydial species are currently known - C.trachomatis, C.pneumoniae, C.pecorum and
C.psittaci [e.g. Raulston (1995) Mol Microbiol 15:607-616; Everett (2000) Vet Microbiol 75:109-126].
C.pneumoniae is closely related to C.trachomatis, as the whole genome comparison of at least
two isolates from each species has shown [Kalman et al. (1999) Nature Genetics 21:385-389; Read
et al. (2000) Nucleic Acids Res 28:1397-406; Stephens et al. (1998) Science 282:754-759]. Based on
surface reaction with patient immune sera, the current view is that only one serotype of
C.pneumoniae exists world-wide.

C.pneumoniae is a common cause of human respiratory disease. It was first isolated from the
conjunctiva of a child in Taiwan in 1965, and was established as a major respiratory pathogen in
1983. In the USA, C.pneumoniae causes approximately 10% of community-acquired pneumonia and
5% of pharyngitis, bronchitis, and sinusitis.

More recently, the spectrum of C.pneumoniae infections has been extended to include
atherosclerosis, coronary heart disease, carotid artery stenosis, myocardial infarction, cerebrovascular
disease, aortic aneurysm, claudication, and stroke. The association of C.pneumoniae with
atherosclerosis is corroborated by the presence of the organism in atherosclerotic lesions throughout
the arterial tree and the near absence of the organism in healthy arterial tissue. C.pneumoniae has
also been isolated from coronary and carotid atheromatous plaques. The bacterium has also been
associated with other acute and chronic respiratory diseases (e.g. otitis media, chronic obstructive 
pulmonary disease, pulmonary exacerbation of cystic fibrosis) as a result of sero-epidemiologic 
observations, case reports, isolation or direct detection of the organism in specimens, and successful 
response to anti-chlamydial antibiotics. To determine whether chronic infection plays a role in
initiation or progression of disease, intervention studies in humans have been initiated, and animal
models of C.pneumoniae infection have been developed.

Considerable knowledge of the epidemiology of C.pneumoniae infection has been derived from
serologic studies using the C.pneumoniae-specific microimmunofluorescence test. Infection is
ubiquitous, and it is estimated that virtually everyone is infected at some point in life, with common
re-infection. Antibodies against C.pneumoniae are rare in children under the age of 5, except in
developing and tropical countries. Antibody prevalence increases rapidly at ages 5 to 14, reaching
50% at the age of 20, and continuing to increase slowly to ~80% by age 70.

A current hypothesis is that C.pneumoniae can persist in an asymptomatic low-grade infection in
very large sections of the human population. When this condition occurs, it is believed that the
presence of C.pneumoniae, and/or the effects of the host reaction to the bacterium, can cause or
promote the progression of cardiovascular illness.

It is not yet clear whether C.pneumoniae is actually a causative agent of cardiovascular disease, or
whether it is just artefactually associated with it. It has been shown, however, that C.pneumoniae
infection can induce LDL oxidation by human monocytes [Kalayoglu et al. (1999) J. Infect. Dis.
180:780-90; Kalayoglu et al. (1999) Am. Heart J. 138:S488-490]. As LDL oxidation products are
highly atherogenic, this observation provides a possible mechanism whereby C.pneumoniae may
cause atheromatous degeneration. If a causative effect is confirmed, vaccination (prophylactic and
therapeutic) will be universally recommended.

Genomic sequence information has been published for C.pneumoniae [Kalman et al. (1999) supra;
Read et al. (2000) supra; Shirai et al. (2000) J. Infect. Dis. 181(Suppl 3):S524-S527; WO99/27105;
WO00/27994] and is available from GenBank. Sequencing efforts have not, however, focused on
vaccination, and the availability of genomic sequence does not in itself indicate which of the >1000
genes might encode useful antigens for immunisation and vaccination. WO99/27105, for instance,
implies that every one of the 1296 ORFs identified in the C.pneumoniae strain CM1 genome is a
useful vaccine antigen.

It is thus an object of the present invention to identify antigens useful for vaccine production and
development from amongst the many proteins present in C.pneumoniae. It is a further object to
identify antigens useful for diagnosis (e.g. immunodiagnosis) of C.pneumoniae.

DISCLOSURE OF THE INVENTION 

The invention provides proteins comprising the C.pneumoniae amino acid sequences disclosed in the
examples. 

It also provides proteins comprising sequences which share at least x% sequence identity with the
C.pneumoniae amino acid sequences disclosed in the examples. Depending on the particular
sequence, x is preferably 50% or more (e.g. 60%, 70%, 80%, 90%, 95%, 99% or more). These
include mutants and allelic variants. Typically, 50% identity or more between two proteins is
considered to be an indication of functional equivalence. Identity between proteins is preferably
determined by the Smith-Waterman homology search algorithm as implemented in the MPSRCH
program (Oxford Molecular), using an affine gap search with parameters gap open penalty=12 and
gap extension penalty=1.
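
As an illustration only, the identity calculation described above can be sketched as a Smith-Waterman local alignment with affine gaps. This is not the MPSRCH implementation. The function name, the simple match/mismatch scores used in place of a substitution matrix, the gap-cost convention (a gap of length k costs gap_open + (k-1)*gap_extend), the extension default of 1, and the choice to report identity over the aligned columns are all assumptions made for the sketch.

```python
def smith_waterman_identity(a, b, match=5, mismatch=-4, gap_open=12, gap_extend=1):
    """Return (score, percent_identity) for the best local alignment of a and b.

    Illustrative only: match/mismatch scores stand in for a real substitution
    matrix, and identity is counted over the columns of the local alignment.
    """
    n, m = len(a), len(b)
    neg = float("-inf")
    # H: best local score ending at (i, j); E: ending in a gap in a; F: gap in b.
    H = [[0] * (m + 1) for _ in range(n + 1)]
    E = [[neg] * (m + 1) for _ in range(n + 1)]
    F = [[neg] * (m + 1) for _ in range(n + 1)]
    best, best_pos = 0, (0, 0)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            E[i][j] = max(H[i][j - 1] - gap_open, E[i][j - 1] - gap_extend)
            F[i][j] = max(H[i - 1][j] - gap_open, F[i - 1][j] - gap_extend)
            s = match if a[i - 1] == b[j - 1] else mismatch
            H[i][j] = max(0, H[i - 1][j - 1] + s, E[i][j], F[i][j])
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)

    # Trace back from the best cell, tracking which matrix we are in.
    i, j = best_pos
    state, identical, columns = "H", 0, 0
    while i > 0 and j > 0:
        if state == "H":
            if H[i][j] == 0:
                break
            s = match if a[i - 1] == b[j - 1] else mismatch
            if H[i][j] == H[i - 1][j - 1] + s:
                identical += a[i - 1] == b[j - 1]
                columns += 1
                i, j = i - 1, j - 1
            elif H[i][j] == E[i][j]:
                state = "E"
            else:
                state = "F"
        elif state == "E":  # gap column consuming a residue of b
            columns += 1
            if E[i][j] == H[i][j - 1] - gap_open:
                state = "H"
            j -= 1
        else:  # state "F": gap column consuming a residue of a
            columns += 1
            if F[i][j] == H[i - 1][j] - gap_open:
                state = "H"
            i -= 1
    identity = 100.0 * identical / columns if columns else 0.0
    return best, identity
```

Under these assumptions, a candidate would meet an "at least x% identity" threshold when the second returned value is greater than or equal to x. A production comparison would use a validated alignment tool and a proper amino acid substitution matrix.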

The invention further provides proteins comprising fragments of the C.pneumoniae amino acid 
sequences disclosed in the examples. The fragments should comprise at least n consecutive amino 
acids from the sequences and, depending on the particular sequence, n is 7 or more (e.g. 8, 10, 12,
14, 16, 18, 20, 30, 40, 50, 75, 100 or more). Preferably the fragments comprise one or more
epitope(s) from the sequence. Other preferred fragments omit a signal peptide. 
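
The "at least n consecutive amino acids" criterion can be illustrated with a short sketch. The function names and the example peptide strings used in testing are hypothetical and are not sequences from the examples; the sketch only checks for an exact shared run of n residues and says nothing about epitopes or signal peptides.

```python
def fragments(sequence, n):
    """Return every run of n consecutive residues in sequence, in order."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    return [sequence[i:i + n] for i in range(len(sequence) - n + 1)]


def shares_fragment(candidate, reference, n=7):
    """True if candidate contains at least n consecutive residues of reference."""
    return any(frag in candidate for frag in set(fragments(reference, n)))
```

With n=7, a reference of length L yields L-6 overlapping fragments, so a 10-residue reference has four 7-residue fragments.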

The proteins of the invention can, of course, be prepared by various means (e.g. native expression,
recombinant expression, purification from cell culture, chemical synthesis etc.) and in various forms
(e.g. native, fusions etc.). They are preferably prepared in substantially pure form (i.e. substantially
free from other C.pneumoniae or host cell proteins). Heterologous expression in E.coli is a preferred
preparative route. 

According to a further aspect, the invention provides nucleic acid comprising the C.pneumoniae 
nucleotide sequences disclosed in the examples. In addition, the invention provides nucleic acid 
comprising sequences which share at least x% sequence identity with the C.pneumoniae nucleotide 
sequences disclosed in the examples. Depending on the particular sequence, x is preferably 50% or
more (e.g. 60%, 70%, 80%, 90%, 95%, 99% or more). 

Furthermore, the invention provides nucleic acid which can hybridise to the C.pneumoniae nucleic 
acid disclosed in the examples, preferably under "high stringency" conditions (e.g. 65°C in a 
0.1xSSC, 0.5% SDS solution).

Nucleic acid comprising fragments of these sequences is also provided. These should comprise at
least n consecutive nucleotides from the C.pneumoniae sequences and, depending on the particular 
sequence, n is 10 or more (e.g. 12, 14, 15, 18, 20, 25, 30, 35, 40, 50, 75, 100, 200, 300 or more). 

According to a further aspect, the invention provides nucleic acid encoding the proteins and protein 
fragments of the invention. 

It should also be appreciated that the invention provides nucleic acid comprising sequences
complementary to those described above (e.g. for antisense or probing purposes). 
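
For illustration, a complementary (antisense or probe) strand for a DNA sequence can be derived as its reverse complement. The function name and the restriction to the unambiguous bases A, C, G and T are assumptions of this sketch.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna):
    """Return the reverse complement of a DNA string written 5' to 3'."""
    if set(dna) - set("ACGTacgt"):
        raise ValueError("only unambiguous DNA bases A, C, G, T are supported")
    # Complement each base, then reverse so the result also reads 5' to 3'.
    return dna.translate(_COMPLEMENT)[::-1]
```

Applying the function twice returns the original sequence, which is a quick sanity check on any probe design.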

Nucleic acid according to the invention can, of course, be prepared in many ways (e.g. by chemical 
synthesis, from genomic or cDNA libraries, from the organism itself etc.) and can take various forms 
(e.g. single stranded, double stranded, vectors, probes etc.). 
In addition, the term "nucleic acid" includes DNA and RNA, and also their analogues, such as those
containing modified backbones, and also peptide nucleic acids (PNA) etc. 

According to a further aspect, the invention provides vectors comprising nucleotide sequences of the 
invention (e.g. cloning or expression vectors) and host cells transformed therewith.

According to a further aspect, the invention provides immunogenic compositions comprising protein
and/or nucleic acid according to the invention. These compositions are suitable for immunisation and
vaccination purposes. Vaccines of the invention may be prophylactic or therapeutic, and will
typically comprise an antigen which can induce antibodies capable of inhibiting (a) chlamydial
adhesion, (b) chlamydial entry, and/or (c) successful replication within the host cell. The vaccines
preferably induce any cell-mediated T-cell responses which are necessary for chlamydial clearance
from the host. 

The invention also provides nucleic acid or protein according to the invention for use as 
medicaments (e.g. as vaccines). It also provides the use of nucleic acid or protein according to the
invention in the manufacture of a medicament (e.g. a vaccine or an immunogenic composition) for
treating or preventing infection due to C.pneumoniae.

The invention also provides a method of treating (e.g. immunising) a patient, comprising
administering to the patient a therapeutically effective amount of nucleic acid or protein according to 
the invention. 

According to further aspects, the invention provides various processes. 

A process for producing proteins of the invention is provided, comprising the step of culturing a host
cell according to the invention under conditions which induce protein expression. 

A process for producing protein or nucleic acid of the invention is provided, wherein the protein or 
nucleic acid is synthesised in part or in whole using chemical means. 

A process for detecting C.pneumoniae in a sample is provided, wherein the sample is contacted with 
an antibody which binds to a protein of the invention.

A summary of standard techniques and procedures which may be employed in order to perform the 
invention (e.g. to utilise the disclosed sequences for immunisation) follows. This summary is not a
limitation on the invention but, rather, gives examples that may be used, but are not required.

General

The practice of the present invention will employ, unless otherwise indicated, conventional techniques of
molecular biology, microbiology, recombinant DNA, and immunology, which are within the skill of the art.
Such techniques are explained fully in the literature e.g. Sambrook Molecular Cloning: A Laboratory Manual,
Second Edition (1989) and Third Edition (2001); DNA Cloning, Volumes I and II (D.N. Glover ed. 1985);
Oligonucleotide Synthesis (M.J. Gait ed. 1984); Nucleic Acid Hybridization (B.D. Hames & S.J. Higgins eds.
1984); Transcription and Translation (B.D. Hames & S.J. Higgins eds. 1984); Animal Cell Culture (R.I.
Freshney ed. 1986); Immobilized Cells and Enzymes (IRL Press, 1986); B. Perbal, A Practical Guide to
Molecular Cloning (1984); the Methods in Enzymology series (Academic Press, Inc.), especially volumes 154 &
155; Gene Transfer Vectors for Mammalian Cells (J.H. Miller and M.P. Calos eds. 1987, Cold Spring Harbor
Laboratory); Mayer and Walker, eds. (1987), Immunochemical Methods in Cell and Molecular Biology
(Academic Press, London); Scopes, (1987) Protein Purification: Principles and Practice, Second Edition
(Springer-Verlag, N.Y.), and Handbook of Experimental Immunology, Volumes I-IV (D.M. Weir and C.C.
Blackwell eds. 1986).

Standard abbreviations for nucleotides and amino acids are used in this specification.

Definitions

A composition containing X is "substantially free" of Y when at least 85% by weight of the total X+Y in the
composition is X. Preferably, X comprises at least about 90% by weight of the total of X+Y in the composition, 
more preferably at least about 95% or even 99% by weight. 
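
The weight-fraction test in this definition amounts to simple arithmetic, sketched below. The function names are hypothetical, and the weights may be in any unit provided X and Y use the same one.

```python
def weight_fraction_x(x_weight, y_weight):
    """Fraction by weight of X in the combined X+Y content of a composition."""
    total = x_weight + y_weight
    if x_weight < 0 or y_weight < 0 or total == 0:
        raise ValueError("weights must be non-negative and not both zero")
    return x_weight / total


def is_substantially_free(x_weight, y_weight, threshold=0.85):
    """True when X makes up at least `threshold` of X+Y by weight.

    The default of 0.85 is the minimum given in the definition; 0.90, 0.95
    or 0.99 can be passed for the preferred levels.
    """
    return weight_fraction_x(x_weight, y_weight) >= threshold
```

For example, 9 mg of X with 1 mg of Y gives a fraction of 0.9, which meets the 85% and 90% levels but not the 95% level.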

The term "comprising" means "including" as well as "consisting" e.g. a composition "comprising" X may 
consist exclusively of X or may include something additional to X, such as X+Y. 

The term "heterologous" refers to two biological components that are not found together in nature. The
components may be host cells, genes, or regulatory regions, such as promoters. Although the heterologous
components are not found together in nature, they can function together, as when a promoter heterologous to a
gene is operably linked to the gene. Another example is where a Chlamydial sequence is heterologous to a
mouse host cell. A further example would be two epitopes from the same or different proteins which have been
assembled in a single protein in an arrangement not found in nature.

An "origin of replication" is a polynucleotide sequence that initiates and regulates replication of polynucleotides, 
such as an expression vector. The origin of replication behaves as an autonomous unit of polynucleotide 
replication within a cell, capable of replication under its own control. An origin of replication may be needed for 
a vector to replicate in a particular host cell. With certain origins of replication, an expression vector can be 
reproduced at a high copy number in the presence of the appropriate proteins within the cell. Examples of
origins are the autonomously replicating sequences, which are effective in yeast; and the viral T-antigen, 
effective in COS-7 cells. 

A "mutant" sequence is defined as DNA, RNA or amino acid sequence differing from but having sequence
identity with the native or disclosed sequence. Depending on the particular sequence, the degree of sequence
identity between the native or disclosed sequence and the mutant sequence is preferably greater than 50% (e.g.
60%, 70%, 80%, 90%, 95%, 99% or more, calculated using the Smith-Waterman algorithm as described above).

As used herein, an "allelic variant" of a nucleic acid molecule, or region, for which nucleic acid sequence is
provided herein is a nucleic acid molecule, or region, that occurs essentially at the same locus in the genome of
another or second isolate, and that, due to natural variation caused by, for example, mutation or recombination,
has a similar but not identical nucleic acid sequence. A coding region allelic variant typically encodes a protein
having similar activity to that of the protein encoded by the gene to which it is being compared. An allelic
variant can also comprise an alteration in the 5' or 3' untranslated regions of the gene, such as in regulatory
control regions (e.g. see US patent 5,753,235).
Expression systems

The Chlamydial nucleotide sequences can be expressed in a variety of different expression systems; for example 
those used with mammalian cells, baculoviruses, plants, bacteria, and yeast. 

i. Mammalian Systems 

Mammalian expression systems are known in the art. A mammalian promoter is any DNA sequence capable of
binding mammalian RNA polymerase and initiating the downstream (3') transcription of a coding sequence (e.g.
structural gene) into mRNA. A promoter will have a transcription initiating region, which is usually placed
proximal to the 5' end of the coding sequence, and a TATA box, usually located 25-30 base pairs (bp) upstream
of the transcription initiation site. The TATA box is thought to direct RNA polymerase II to begin RNA
synthesis at the correct site. A mammalian promoter will also contain an upstream promoter element, usually
located within 100 to 200 bp upstream of the TATA box. An upstream promoter element determines the rate at
which transcription is initiated and can act in either orientation [Sambrook et al. (1989) "Expression of Cloned
Genes in Mammalian Cells." In Molecular Cloning: A Laboratory Manual, 2nd ed.].

Mammalian viral genes are often highly expressed and have a broad host range; therefore sequences encoding
mammalian viral genes provide particularly useful promoter sequences. Examples include the SV40 early
promoter, mouse mammary tumor virus LTR promoter, adenovirus major late promoter (Ad MLP), and herpes
simplex virus promoter. In addition, sequences derived from non-viral genes, such as the murine
metallothionein gene, also provide useful promoter sequences. Expression may be either constitutive or
regulated (inducible); depending on the promoter, expression can be induced with glucocorticoid in
hormone-responsive cells.

The presence of an enhancer element (enhancer), combined with the promoter elements described above, will
usually increase expression levels. An enhancer is a regulatory DNA sequence that can stimulate transcription up
to 1000-fold when linked to homologous or heterologous promoters, with synthesis beginning at the normal
RNA start site. Enhancers are also active when they are placed upstream or downstream from the transcription
initiation site, in either normal or flipped orientation, or at a distance of more than 1000 nucleotides from the
promoter [Maniatis et al. (1987) Science 236:1237; Alberts et al. (1989) Molecular Biology of the Cell, 2nd ed.].
Enhancer elements derived from viruses may be particularly useful, because they usually have a broader host
range. Examples include the SV40 early gene enhancer [Dijkema et al. (1985) EMBO J. 4:761] and the
enhancer/promoters derived from the long terminal repeat (LTR) of the Rous Sarcoma Virus [Gorman et al.
(1982) PNAS USA 79:6777] and from human cytomegalovirus [Boshart et al. (1985) Cell 41:521]. Additionally,
some enhancers are regulatable and become active only in the presence of an inducer, such as a hormone or
metal ion [Sassone-Corsi and Borelli (1986) Trends Genet. 2:215; Maniatis et al. (1987) Science 236:1237].

A DNA molecule may be expressed intracellularly in mammalian cells. A promoter sequence may be directly
linked with the DNA molecule, in which case the first amino acid at the N-terminus of the recombinant protein
will always be a methionine, which is encoded by the ATG start codon. If desired, the N-terminus may be
cleaved from the protein by in vitro incubation with cyanogen bromide.

Alternatively, foreign proteins can also be secreted from the cell into the growth media by creating chimeric
DNA molecules that encode a fusion protein comprised of a leader sequence fragment that provides for secretion
of the foreign protein in mammalian cells. Preferably, there are processing sites encoded between the leader
fragment and the foreign gene that can be cleaved either in vivo or in vitro. The leader sequence fragment
usually encodes a signal peptide comprised of hydrophobic amino acids which direct the secretion of the protein
from the cell. The adenovirus tripartite leader is an example of a leader sequence that provides for secretion of a
foreign protein in mammalian cells.

Usually, transcription termination and polyadenylation sequences recognized by mammalian cells are regulatory
regions located 3' to the translation stop codon and thus, together with the promoter elements, flank the coding
sequence. The 3' terminus of the mature mRNA is formed by site-specific post-transcriptional cleavage and
polyadenylation [Birnstiel et al. (1985) Cell 41:349; Proudfoot and Whitelaw (1988) "Termination and 3' end
processing of eukaryotic RNA." In Transcription and splicing (ed. B.D. Hames and D.M. Glover); Proudfoot
(1989) Trends Biochem. Sci. 14:105]. These sequences direct the transcription of an mRNA which can be
translated into the polypeptide encoded by the DNA. Examples of transcription terminator/polyadenylation
signals include those derived from SV40 [Sambrook et al. (1989) "Expression of cloned genes in cultured
mammalian cells." In Molecular Cloning: A Laboratory Manual].

Usually, the above described components, comprising a promoter, polyadenylation signal, and transcription
termination sequence are put together into expression constructs. Enhancers, introns with functional splice donor
and acceptor sites, and leader sequences may also be included in an expression construct, if desired. Expression
constructs are often maintained in a replicon, such as an extrachromosomal element (e.g. plasmids) capable of
stable maintenance in a host, such as mammalian cells or bacteria. Mammalian replication systems include those
derived from animal viruses, which require trans-acting factors to replicate. For example, plasmids containing
the replication systems of papovaviruses, such as SV40 [Gluzman (1981) Cell 25:175] or polyomavirus,
replicate to extremely high copy number in the presence of the appropriate viral T antigen. Additional examples
of mammalian replicons include those derived from bovine papillomavirus and Epstein-Barr virus. Additionally,
the replicon may have two replication systems, thus allowing it to be maintained, for example, in mammalian
cells for expression and in a prokaryotic host for cloning and amplification. Examples of such
mammalian-bacteria shuttle vectors include pMT2 [Kaufman et al. (1989) Mol Cell Biol 9:946] and pHEBO [Shimizu et al.
(1986) Mol Cell Biol 6:1074].

The transformation procedure used depends upon the host to be transformed. Methods for introduction of
heterologous polynucleotides into mammalian cells are known in the art and include dextran-mediated
transfection, calcium phosphate precipitation, polybrene-mediated transfection, protoplast fusion,
electroporation, encapsulation of polynucleotide(s) in liposomes, and direct microinjection of the DNA into nuclei.

Mammalian cell lines available as hosts for expression are known in the art and include many immortalized cell 
lines available from the American Type Culture Collection (ATCC), including but not limited to, Chinese 
hamster ovary (CHO) cells, HeLa cells, baby hamster kidney (BHK) cells, monkey kidney cells (COS), human 
hepatocellular carcinoma cells (e.g. Hep G2), and a number of other cell lines.

ii. Baculovirus Systems

The polynucleotide encoding the protein can also be inserted into a suitable insect expression vector, and is
operably linked to the control elements within that vector. Vector construction employs techniques which are
known in the art. Generally, the components of the expression system include a transfer vector, usually a
bacterial plasmid, which contains both a fragment of the baculovirus genome, and a convenient restriction site
for insertion of the heterologous gene or genes to be expressed; a wild type baculovirus with a sequence



WO 02/02606 PCT/IB01/01445 

homologous to the baculovirus-specific fragment in the transfer vector (this allows for the homologous
recombination of the heterologous gene into the baculovirus genome); and appropriate insect host cells and
growth media. 

After inserting the DNA sequence encoding the protein into the transfer vector, the vector and the wild type viral
genome are transfected into an insect host cell where the vector and viral genome are allowed to recombine. The
packaged recombinant virus is expressed and recombinant plaques are identified and purified. Materials and
methods for baculovirus/insect cell expression systems are commercially available in kit form from, inter alia,
Invitrogen, San Diego CA ("MaxBac" kit). These techniques are generally known to those skilled in the art and
fully described in Summers and Smith, Texas Agricultural Experiment Station Bulletin No. 1555 (1987)
(hereinafter "Summers and Smith").

Prior to inserting the DNA sequence encoding the protein into the baculovirus genome, the above described 
components, comprising a promoter, leader (if desired), coding sequence of interest, and transcription 
termination sequence, are usually assembled into an intermediate transplacement construct (transfer vector). This 
construct may contain a single gene and operably linked regulatory elements; multiple genes, each with its 
own set of operably linked regulatory elements; or multiple genes, regulated by the same set of regulatory
elements. Intermediate transplacement constructs are often maintained in a replicon, such as an 
extrachromosomal element (e.g. plasmids) capable of stable maintenance in a host, such as a bacterium. The 
replicon will have a replication system, thus allowing it to be maintained in a suitable host for cloning and 
amplification. 

Currently, the most commonly used transfer vector for introducing foreign genes into AcNPV is pAc373. Many
other vectors, known to those of skill in the art, have also been designed. These include, for example, pVL985
(which alters the polyhedrin start codon from ATG to ATT, and which introduces a BamHI cloning site 32
basepairs downstream from the ATT; see Luckow and Summers, Virology (1989) 77:31).

The plasmid usually also contains the polyhedrin polyadenylation signal (Miller et al. (1988) Ann. Rev.
Microbiol., 42:177) and a prokaryotic ampicillin-resistance (amp) gene and origin of replication for selection
and propagation in E. coli. 

Baculovirus transfer vectors usually contain a baculovirus promoter. A baculovirus promoter is any DNA 
sequence capable of binding a baculovirus RNA polymerase and initiating the downstream (5' to 3') transcription 
of a coding sequence (e.g. structural gene) into mRNA. A promoter will have a transcription initiation region 
which is usually placed proximal to the 5' end of the coding sequence. This transcription initiation region usually
includes an RNA polymerase binding site and a transcription initiation site. A baculovirus transfer vector may 
also have a second domain called an enhancer, which, if present, is usually distal to the structural gene. 
Expression may be either regulated or constitutive. 

Structural genes, abundantly transcribed at late times in a viral infection cycle, provide particularly useful
promoter sequences. Examples include sequences derived from the gene encoding the viral polyhedrin protein,
Friesen et al., (1986) "The Regulation of Baculovirus Gene Expression," in: The Molecular Biology of
Baculoviruses (ed. Walter Doerfler); EPO Publ. Nos. 127 839 and 155 476; and the gene encoding the p10
protein, Vlak et al., (1988), J. Gen. Virol. 69:165.

DNA encoding suitable signal sequences can be derived from genes for secreted insect or baculovirus proteins,
such as the baculovirus polyhedrin gene (Carbonell et al. (1988) Gene, 75:409). Alternatively, since the signals
for mammalian cell posttranslational modifications (such as signal peptide cleavage, proteolytic cleavage, and
phosphorylation) appear to be recognized by insect cells, and the signals required for secretion and nuclear
accumulation also appear to be conserved between the invertebrate cells and vertebrate cells, leaders of
non-insect origin, such as those derived from genes encoding human α-interferon, Maeda et al., (1985), Nature
315:592; human gastrin-releasing peptide, Lebacq-Verheyden et al., (1988), Molec. Cell. Biol. 8:3129; human
IL-2, Smith et al., (1985) Proc. Nat'l Acad. Sci. USA, 82:8404; mouse IL-3, Miyajima et al., (1987) Gene
55:273; and human glucocerebrosidase, Martin et al. (1988) DNA, 7:99, can also be used to provide for secretion
in insects.

A recombinant polypeptide or polyprotein may be expressed intracellularly or, if it is expressed with the proper
regulatory sequences, it can be secreted. Good intracellular expression of nonfused foreign proteins usually
requires heterologous genes that ideally have a short leader sequence containing suitable translation initiation 
signals preceding an ATG start signal. If desired, methionine at the N-terminus may be cleaved from the mature 
protein by in vitro incubation with cyanogen bromide. 

Alternatively, recombinant polyproteins or proteins which are not naturally secreted can be secreted from the 
insect cell by creating chimeric DNA molecules that encode a fusion protein comprised of a leader sequence
fragment that provides for secretion of the foreign protein in insects. The leader sequence fragment usually 
encodes a signal peptide comprised of hydrophobic amino acids which direct the translocation of the protein into 
the endoplasmic reticulum. 

After insertion of the DNA sequence and/or the gene encoding the expression product precursor of the protein,
an insect cell host is co-transformed with the heterologous DNA of the transfer vector and the genomic DNA of
wild type baculovirus - usually by co-transfection. The promoter and transcription termination sequence of the
construct will usually comprise a 2-5kb section of the baculovirus genome. Methods for introducing
heterologous DNA into the desired site in the baculovirus are known in the art. (See Summers and Smith
supra; Ju et al. (1987); Smith et al., Mol. Cell. Biol. (1983) 3:2156; and Luckow and Summers (1989)). For
example, the insertion can be into a gene such as the polyhedrin gene, by homologous double crossover
recombination; insertion can also be into a restriction enzyme site engineered into the desired baculovirus gene.
Miller et al., (1989), Bioessays 4:91. The DNA sequence, when cloned in place of the polyhedrin gene in the
expression vector, is flanked both 5' and 3' by polyhedrin-specific sequences and is positioned downstream of
the polyhedrin promoter.

The newly formed baculovirus expression vector is subsequently packaged into an infectious recombinant
baculovirus. Homologous recombination occurs at low frequency (between ~1% and ~5%); thus, the majority of
the virus produced after cotransfection is still wild-type virus. Therefore, a method is necessary to identify
recombinant viruses. An advantage of the expression system is a visual screen allowing recombinant viruses to
be distinguished. The polyhedrin protein, which is produced by the native virus, is produced at very high levels
in the nuclei of infected cells at late times after viral infection. Accumulated polyhedrin protein forms occlusion
bodies that also contain embedded particles. These occlusion bodies, up to 15 µm in size, are highly refractile,
giving them a bright shiny appearance that is readily visualized under the light microscope. Cells infected with
recombinant viruses lack occlusion bodies. To distinguish recombinant virus from wild-type virus, the
transfection supernatant is plaqued onto a monolayer of insect cells by techniques known to those skilled in the
art. Namely, the plaques are screened under the light microscope for the presence (indicative of wild-type virus)
or absence (indicative of recombinant virus) of occlusion bodies. "Current Protocols in Microbiology" Vol. 2
(Ausubel et al. eds) at 16.8 (Supp. 10, 1990); Summers & Smith, supra; Miller et al. (1989).
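
As a rough illustration of why a screen is needed at a recombination frequency of about 1% to 5%, the number of plaques that would have to be examined to find at least one recombinant can be estimated. The sketch below assumes (this is not stated in the text) that each plaque is independently recombinant with a fixed probability; the function name and the 95% confidence default are illustrative choices only.

```python
import math


def plaques_to_screen(recombinant_fraction: float, confidence: float = 0.95) -> int:
    """Smallest n such that P(at least one recombinant among n plaques) >= confidence.

    Assumption (not from the text): plaques are independent, each recombinant
    with probability `recombinant_fraction`.
    """
    if not 0.0 < recombinant_fraction < 1.0:
        raise ValueError("recombinant_fraction must be strictly between 0 and 1")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be strictly between 0 and 1")
    # 1 - (1 - f)^n >= c   is equivalent to   n >= ln(1 - c) / ln(1 - f)
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - recombinant_fraction))


if __name__ == "__main__":
    for fraction in (0.01, 0.05):
        print(fraction, plaques_to_screen(fraction))
```

Under these assumptions, roughly 300 plaques would be needed at a 1% frequency and roughly 60 at 5%, which is consistent with the practical value of a visual screen.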

Recombinant baculovirus expression vectors have been developed for infection into several insect cells. For 
example, recombinant baculoviruses have been developed for, inter alia: Aedes aegypti, Autographa
californica, Bombyx mori, Drosophila melanogaster, Spodoptera frugiperda, and Trichoplusia ni (WO
89/046699; Carbonell et al. (1985) J. Virol. 56:153; Wright (1986) Nature 321:718; Smith et al. (1983) Mol.
Cell. Biol. 3:2156; and see generally, Fraser et al. (1989) In Vitro Cell. Dev. Biol. 25:225).

Cells and cell culture media are commercially available for both direct and fusion expression of heterologous 
polypeptides in a baculovirus/expression system; cell culture technology is generally known to those skilled in 
the art. See, e.g. Summers and Smith supra.

The modified insect cells may then be grown in an appropriate nutrient medium, which allows for stable 
maintenance of the plasmid(s) present in the modified insect host. Where the expression product gene is under 
inducible control, the host may be grown to high density, and expression induced. Alternatively, where 
expression is constitutive, the product will be continuously expressed into the medium and the nutrient medium
must be continuously circulated, while removing the product of interest and augmenting depleted nutrients. The
product may be purified by such techniques as chromatography, e.g. HPLC, affinity chromatography, ion 
exchange chromatography, etc.; electrophoresis; density gradient centrifugation; solvent extraction, or the like. 
As appropriate, the product may be further purified, as required, so as to remove substantially any insect proteins 
which are also secreted in the medium or result from lysis of insect cells, so as to provide a product which is at
least substantially free of host debris, e.g. proteins, lipids and polysaccharides.

In order to obtain protein expression, recombinant host cells derived from the transformants are incubated under 
conditions which allow expression of the recombinant protein encoding sequence. These conditions will vary, 
dependent upon the host cell selected. However, the conditions are readily ascertainable to those of ordinary skill 
in the art, based upon what is known in the art. 

iii. Plant Systems

There are many plant cell culture and whole plant genetic expression systems known in the art. Exemplary plant 
cellular genetic expression systems include those described in patents, such as: US 5,693,506; US 5,659,122; 
and US 5,608,143. Additional examples of genetic expression in plant cell culture have been described by Zenk,
Phytochemistry 30:3861-3863 (1991). Descriptions of plant protein signal peptides may be found, in addition to
the references described above, in Vaulcombe et al., Mol. Gen. Genet. 209:33-40 (1987); Chandler et al., Plant
Molecular Biology 3:407-418 (1984); Rogers, J. Biol. Chem. 260:3731-3738 (1985); Rothstein et al., Gene
55:353-356 (1987); Whittier et al., Nucleic Acids Research 15:2515-2535 (1987); Wirsel et al., Molecular
Microbiology 3:3-14 (1989); Yu et al., Gene 122:247-253 (1992). A description of the regulation of plant gene
expression by the phytohormone gibberellic acid, and of secreted enzymes induced by gibberellic acid, can be found
in R.L. Jones and J. MacMillin, Gibberellins, in: Advanced Plant Physiology, Malcolm B. Wilkins, ed., 1984
Pitman Publishing Limited, London, pp. 21-52. References that describe other metabolically-regulated genes:
Sheen, Plant Cell 2:1027-1038 (1990); Maas et al., EMBO J. 9:3447-3452 (1990); Benkel and Hickey, Proc.
Natl. Acad. Sci. 84:1337-1339 (1987).

Typically, using techniques known in the art, a desired polynucleotide sequence is inserted into an expression
cassette comprising genetic regulatory elements designed for operation in plants. The expression cassette is 
inserted into a desired expression vector with companion sequences upstream and downstream from the 
expression cassette suitable for expression in a plant host. The companion sequences will be of plasmid or viral 
origin and provide necessary characteristics to the vector to permit the vectors to move DNA from an original
cloning host, such as bacteria, to the desired plant host. The basic bacterial/plant vector construct will preferably
provide a broad host range prokaryote replication origin; a prokaryote selectable marker; and, for Agrobacterium
transformations, T-DNA sequences for Agrobacterium-mediated transfer to plant chromosomes. Where the
heterologous gene is not readily amenable to detection, the construct will preferably also have a selectable
marker gene suitable for determining if a plant cell has been transformed. A general review of suitable markers,
for example for the members of the grass family, is found in Wilmink and Dons, 1993, Plant Mol. Biol. Reptr.,
11(2):165-185. 

Sequences suitable for permitting integration of the heterologous sequence into the plant genome are also 
recommended. These might include transposon sequences and the like for homologous recombination as well as 
Ti sequences which permit random insertion of a heterologous expression cassette into a plant genome. Suitable
prokaryote selectable markers include resistance toward antibiotics such as ampicillin or tetracycline. Other 
DNA sequences encoding additional functions may also be present in the vector, as is known in the art. 

The nucleic acid molecules of the subject invention may be included into an expression cassette for expression 
of the protein(s) of interest. Usually, there will be only one expression cassette, although two or more are 
feasible. The recombinant expression cassette will contain, in addition to the heterologous protein encoding
sequence, the following elements: a promoter region, plant 5' untranslated sequences, an initiation codon (depending
upon whether or not the structural gene comes equipped with one), and a transcription and translation termination
sequence. Unique restriction enzyme sites at the 5' and 3' ends of the cassette allow for easy insertion into a pre- 
existing vector. 
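
The ordering of cassette elements described above can be sketched as a small data structure. This is only an illustrative model: the class and method names are hypothetical, the example sequences in the usage note are toy placeholders rather than real plant regulatory elements, and the restriction sites are passed in by the caller rather than taken from the text. The sketch adds an ATG only when the coding sequence does not already begin with one, and checks that each flanking site occurs exactly once in the assembled cassette.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cassette:
    """Hypothetical model of an expression cassette (DNA given 5' to 3')."""

    promoter: str
    five_prime_utr: str
    coding_sequence: str
    terminator: str

    def assemble(self, site_5: str, site_3: str) -> str:
        """Join the elements in order and flank them with restriction sites."""
        cds = self.coding_sequence.upper()
        # Supply an initiation codon only if the structural gene lacks one.
        if not cds.startswith("ATG"):
            cds = "ATG" + cds
        core = self.promoter.upper() + self.five_prime_utr.upper() + cds + self.terminator.upper()
        full = site_5.upper() + core + site_3.upper()
        # The flanking sites are only useful for insertion if they are unique.
        for site in (site_5.upper(), site_3.upper()):
            if full.count(site) != 1:
                raise ValueError(f"restriction site {site} is not unique in the cassette")
        return full
```

For example, `Cassette("CCCC", "GGGG", "GCTTAA", "TTTT").assemble("GAATTC", "AAGCTT")` returns the placeholder elements joined in order, with an ATG added before the coding sequence and the two sites at the ends.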

A heterologous coding sequence may be for any protein relating to the present invention. The sequence encoding
the protein of interest will encode a signal peptide which allows processing and translocation of the protein, as
appropriate, and will usually lack any sequence which might result in the binding of the desired protein of the
invention to a membrane. Since, for the most part, the transcriptional initiation region will be for a gene which is
expressed and translocated during germination, by employing the signal peptide which provides for
translocation, one may also provide for translocation of the protein of interest. In this way, the protein(s) of
interest will be translocated from the cells in which they are expressed and may be efficiently harvested.
Typically, secretion in seeds is across the aleurone or scutellar epithelium layer into the endosperm of the seed.
While it is not required that the protein be secreted from the cells in which the protein is produced, this
facilitates the isolation and purification of the recombinant protein.

Since the ultimate expression of the desired gene product will be in a eucaryotic cell, it is desirable to determine
whether any portion of the cloned gene contains sequences which will be processed out as introns by the host's
spliceosome machinery. If so, site-directed mutagenesis of the "intron" region may be conducted to prevent losing
a portion of the genetic message as a false intron code (Reed and Maniatis, Cell 41:95-105, 1985).

The vector can be microinjected directly into plant cells by use of micropipettes to mechanically transfer the
recombinant DNA. Crossway, Mol. Gen. Genet., 202:179-185, 1985. The genetic material may also be
transferred into the plant cell by using polyethylene glycol, Krens, et al., Nature, 296, 72-74, 1982. Another
method of introduction of nucleic acid segments is high velocity ballistic penetration by small particles with the
nucleic acid either within the matrix of small beads or particles, or on the surface, Klein, et al., Nature, 327,
70-73, 1987 and Knudsen and Muller, 1991, Planta, 185:330-336 teaching particle bombardment of barley
endosperm to create transgenic barley. Yet another method of introduction would be fusion of protoplasts with
other entities, either minicells, cells, lysosomes or other fusible lipid-surfaced bodies, Fraley, et al., Proc. Natl.
Acad. Sci. USA, 79, 1859-1863, 1982.

The vector may also be introduced into the plant cells by electroporation (Fromm et al., Proc. Natl. Acad. Sci.
USA 82:5824, 1985). In this technique, plant protoplasts are electroporated in the presence of plasmids
containing the gene construct. Electrical impulses of high field strength reversibly permeabilize biomembranes
allowing the introduction of the plasmids. Electroporated plant protoplasts reform the cell wall, divide, and form 
plant callus. 

All plants from which protoplasts can be isolated and cultured to give whole regenerated plants can be 
transformed by the present invention so that whole plants are recovered which contain the transferred gene. It is
known that practically all plants can be regenerated from cultured cells or tissues, including but not limited to all
major species of sugarcane, sugar beet, cotton, fruit and other trees, legumes and vegetables. Some suitable
plants include, for example, species from the genera Fragaria, Lotus, Medicago, Onobrychis, Trifolium,
Trigonella, Vigna, Citrus, Linum, Geranium, Manihot, Daucus, Arabidopsis, Brassica, Raphanus, Sinapis,
Atropa, Capsicum, Datura, Hyoscyamus, Lycopersicon, Nicotiana, Solanum, Petunia, Digitalis, Majorana,
Cichorium, Helianthus, Lactuca, Bromus, Asparagus, Antirrhinum, Hemerocallis, Nemesia, Pelargonium,
Panicum, Pennisetum, Ranunculus, Senecio, Salpiglossis, Cucumis, Browallia, Glycine, Lolium, Zea, Triticum,
and Sorghum.

Means for regeneration vary from species to species of plants, but generally a suspension of transformed 
protoplasts containing copies of the heterologous gene is first provided. Callus tissue is formed and shoots may
be induced from callus and subsequently rooted. Alternatively, embryo formation can be induced from the
protoplast suspension. These embryos germinate as natural embryos to form plants. The culture media will 
generally contain various amino acids and hormones, such as auxin and cytokinins. It is also advantageous to 
add glutamic acid and proline to the medium, especially for such species as corn and alfalfa. Shoots and roots 
normally develop simultaneously. Efficient regeneration will depend on the medium, on the genotype, and on
the history of the culture. If these three variables are controlled, then regeneration is fully reproducible and
repeatable. 

In some plant cell culture systems, the desired protein of the invention may be excreted or, alternatively, the
protein may be extracted from the whole plant. Where the desired protein of the invention is secreted into the
medium, it may be collected. Alternatively, the embryos and embryoless-half seeds or other plant tissue may be
mechanically disrupted to release any secreted protein between cells and tissues. The mixture may be suspended
in a buffer solution to retrieve soluble proteins. Conventional protein isolation and purification methods will
then be used to purify the recombinant protein. Parameters of time, temperature, pH, oxygen, and volumes will be
adjusted through routine methods to optimize expression and recovery of heterologous protein.

iv. Bacterial Systems

Bacterial expression techniques are known in the art. A bacterial promoter is any DNA sequence capable of
binding bacterial RNA polymerase and initiating the downstream (3') transcription of a coding sequence (e.g.
structural gene) into mRNA. A promoter will have a transcription initiation region which is usually placed
proximal to the 5' end of the coding sequence. This transcription initiation region usually includes an RNA
polymerase binding site and a transcription initiation site. A bacterial promoter may also have a second domain
called an operator, that may overlap an adjacent RNA polymerase binding site at which RNA synthesis begins.
The operator permits negative regulated (inducible) transcription, as a gene repressor protein may bind the
operator and thereby inhibit transcription of a specific gene. Constitutive expression may occur in the absence of
negative regulatory elements, such as the operator. In addition, positive regulation may be achieved by a gene
activator protein binding sequence, which, if present, is usually proximal (5') to the RNA polymerase binding
sequence. An example of a gene activator protein is the catabolite activator protein (CAP), which helps initiate
transcription of the lac operon in Escherichia coli (E. coli) [Raibaud et al. (1984) Annu. Rev. Genet. 18:173].
Regulated expression may therefore be either positive or negative, thereby either enhancing or reducing
transcription.

Sequences encoding metabolic pathway enzymes provide particularly useful promoter sequences. Examples 
include promoter sequences derived from sugar metabolizing enzymes, such as galactose, lactose (lac) [Chang et 
al. (1977) Nature 198:1056], and maltose. Additional examples include promoter sequences derived from
biosynthetic enzymes such as tryptophan (trp) [Goeddel et al. (1980) Nuc. Acids Res. 8:4057; Yelverton et al.
(1981) Nucl. Acids Res. 9:731; US patent 4,738,921; EP-A-0036776 and EP-A-0121775]. The β-lactamase (bla)
promoter system [Weissmann (1981) "The cloning of interferon and other mistakes." In Interferon 3 (ed. I.
Gresser)], bacteriophage lambda PL [Shimatake et al. (1981) Nature 292:128] and T5 [US patent 4,689,406] 
promoter systems also provide useful promoter sequences. 

In addition, synthetic promoters which do not occur in nature also function as bacterial promoters. For example, 
transcription activation sequences of one bacterial or bacteriophage promoter may be joined with the operon
sequences of another bacterial or bacteriophage promoter, creating a synthetic hybrid promoter [US
patent 4,551,433]. For example, the tac promoter is a hybrid trp-lac promoter comprised of both trp promoter
and lac operon sequences that is regulated by the lac repressor [Amann et al. (1983) Gene 25:167; de Boer et al.
(1983) Proc. Natl. Acad. Sci. 80:21]. Furthermore, a bacterial promoter can include naturally occurring
promoters of non-bacterial origin that have the ability to bind bacterial RNA polymerase and initiate
transcription. A naturally occurring promoter of non-bacterial origin can also be coupled with a compatible RNA
polymerase to produce high levels of expression of some genes in prokaryotes. The bacteriophage T7 RNA
polymerase/promoter system is an example of a coupled promoter system [Studier et al. (1986) J. Mol. Biol.
189:113; Tabor et al. (1985) Proc. Natl. Acad. Sci. 82:1074]. In addition, a hybrid promoter can also be
comprised of a bacteriophage promoter and an E. coli operator region (EPO-A-0 267 851).

In addition to a functioning promoter sequence, an efficient ribosome binding site is also useful for the
expression of foreign genes in prokaryotes. In E. coli, the ribosome binding site is called the Shine-Dalgarno
(SD) sequence and includes an initiation codon (ATG) and a sequence 3-9 nucleotides in length located 3-11
nucleotides upstream of the initiation codon [Shine et al. (1975) Nature 254:34]. The SD sequence is thought to
promote binding of mRNA to the ribosome by the pairing of bases between the SD sequence and the 3' end of E.
coli 16S rRNA [Steitz et al. (1979) "Genetic signals and nucleotide sequences in messenger RNA." In Biological
Regulation and Development: Gene Expression (ed. R.F. Goldberger)]. For the expression of eukaryotic genes and
prokaryotic genes with a weak ribosome-binding site, see [Sambrook et al. (1989) "Expression of cloned genes in
Escherichia coli." In Molecular Cloning: A Laboratory Manual].
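
The positional definition of the SD sequence given above (3-9 nucleotides long, 3-11 nucleotides upstream of the ATG) can be turned into a simple scan. The sketch below is illustrative only: the function name is hypothetical, the consensus string TAAGGAGGT is taken as the DNA-sense complement of the 3' end of E. coli 16S rRNA (an assumption not stated in the text), matching is exact with no allowance for mismatches, and the "gap" is interpreted as the number of nucleotides between the end of the match and the A of the ATG.

```python
from typing import Optional, Tuple

# Assumed consensus: complement of the 3' end of E. coli 16S rRNA, written as DNA.
SD_CONSENSUS = "TAAGGAGGT"


def find_sd(seq: str, start_index: int, min_len: int = 3, max_len: int = 9,
            min_gap: int = 3, max_gap: int = 11) -> Optional[Tuple[str, int]]:
    """Return (match, gap) for the longest exact consensus fragment upstream of the ATG.

    `gap` counts the nucleotides between the end of the match and the start codon.
    Returns None if no fragment of at least `min_len` nucleotides is found.
    """
    seq = seq.upper().replace("U", "T")
    if seq[start_index:start_index + 3] != "ATG":
        raise ValueError("start_index must point at an ATG initiation codon")
    best = None
    for gap in range(min_gap, max_gap + 1):
        end = start_index - gap  # exclusive end of the candidate window
        for length in range(max_len, min_len - 1, -1):
            begin = end - length
            if begin < 0:
                continue
            candidate = seq[begin:end]
            if candidate in SD_CONSENSUS:
                if best is None or length > len(best[0]):
                    best = (candidate, gap)
                break  # longest match for this gap found
    return best
```

A match of only three nucleotides is weak evidence on its own; a practical screen would likely require a longer match or score the base pairing, which this sketch does not attempt.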

A DNA molecule may be expressed intracellularly. A promoter sequence may be directly linked with the DNA 
molecule, in which case the first amino acid at the N-terminus will always be a methionine, which is encoded by
the ATG start codon. If desired, methionine at the N-terminus may be cleaved from the protein by in vitro
incubation with cyanogen bromide or by either in vivo or in vitro incubation with a bacterial methionine
N-terminal peptidase (EPO-A-0 219 237).

Fusion proteins provide an alternative to direct expression. Usually, a DNA sequence encoding the N-terminal
portion of an endogenous bacterial protein, or other stable protein, is fused to the 5' end of heterologous coding
sequences. Upon expression, this construct will provide a fusion of the two amino acid sequences. For example,
the bacteriophage lambda cII gene can be linked at the 5' terminus of a foreign gene and expressed in bacteria.
The resulting fusion protein preferably retains a site for a processing enzyme (factor Xa) to cleave the
bacteriophage protein from the foreign gene [Nagai et al. (1984) Nature 309:810]. Fusion proteins can also be
made with sequences from the lacZ [Jia et al. (1987) Gene 60:197], trpE [Allen et al. (1987) J. Biotechnol. 5:93;
Makoff et al. (1989) J. Gen. Microbiol. 135:11], and Chey [EP-A-0 324 647] genes. The DNA sequence at the
junction of the two amino acid sequences may or may not encode a cleavable site. Another example is a
ubiquitin fusion protein. Such a fusion protein is made with the ubiquitin region that preferably retains a site for
a processing enzyme (e.g. ubiquitin specific processing-protease) to cleave the ubiquitin from the foreign
protein. Through this method, native foreign protein can be isolated [Miller et al. (1989) Bio/Technology 7:698].
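
The effect of a cleavable junction can be illustrated with a short sketch. It assumes the commonly cited factor Xa recognition sequence Ile-Glu-Gly-Arg (IEGR) with cleavage after the arginine; this motif is not given in the text, the function name is hypothetical, and the protein sequences in the usage note are invented placeholders. The function cuts at every occurrence of the site, which also shows why a site occurring inside the foreign protein would be a problem.

```python
from typing import List

# Assumed recognition motif (one-letter amino acid code); cleavage occurs after the R.
FACTOR_XA_SITE = "IEGR"


def factor_xa_fragments(protein: str, site: str = FACTOR_XA_SITE) -> List[str]:
    """Split a protein sequence after every occurrence of the cleavage site."""
    protein = protein.upper()
    fragments = []
    start = 0
    pos = protein.find(site)
    while pos != -1:
        cut = pos + len(site)
        fragments.append(protein[start:cut])
        start = cut
        pos = protein.find(site, cut)
    if start < len(protein):
        fragments.append(protein[start:])
    return fragments
```

For a placeholder fusion such as `"MKTAIEGRVLSPAD"`, the function returns the carrier portion ending in IEGR followed by the released foreign portion; a sequence with no site comes back as a single uncleaved fragment.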

Alternatively, foreign proteins can also be secreted from the cell by creating chimeric DNA molecules that 
encode a fusion protein comprised of a signal peptide sequence fragment that provides for secretion of the 
foreign protein in bacteria [US patent 4,336,336]. The signal sequence fragment usually encodes a signal peptide 
comprised of hydrophobic amino acids which direct the secretion of the protein from the cell. The protein is 
either secreted into the growth media (gram-positive bacteria) or into the periplasmic space, located between the
inner and outer membrane of the cell (gram-negative bacteria). Preferably there are processing sites, which can
be cleaved either in vivo or in vitro, encoded between the signal peptide fragment and the foreign gene.

DNA encoding suitable signal sequences can be derived from genes for secreted bacterial proteins, such as the 
E. coli outer membrane protein gene (ompA) [Masui et al. (1983), in: Experimental Manipulation of Gene
Expression; Ghrayeb et al. (1984) EMBO J. 3:2437] and the E. coli alkaline phosphatase signal sequence (phoA)
[Oka et al. (1985) Proc. Natl. Acad. Sci. 82:7212]. As an additional example, the signal sequence of the
alpha-amylase gene from various Bacillus strains can be used to secrete heterologous proteins from B. subtilis
[Palva et al. (1982) Proc. Natl. Acad. Sci. USA 79:5582; EP-A-0 244 042].

Usually, transcription termination sequences recognized by bacteria are regulatory regions located 3' to the 
translation stop codon, and thus together with the promoter flank the coding sequence. These sequences direct
the transcription of an mRNA which can be translated into the polypeptide encoded by the DNA. Transcription 
termination sequences frequently include DNA sequences of about 50 nucleotides capable of forming stem loop 
structures that aid in terminating transcription. Examples include transcription termination sequences derived 
from genes with strong promoters, such as the trp gene in E. coli as well as other biosynthetic genes. 
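
The stem-loop property mentioned above can be checked computationally by looking for inverted repeats. The following is a minimal sketch, assuming exact Watson-Crick complementarity (no G-U wobble pairs or mismatches) and hypothetical default values for stem length and loop size; the function names are illustrative and the example sequence in the usage note is invented.

```python
from typing import List, Tuple

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Reverse complement of an upper-case DNA string."""
    return dna.translate(_COMPLEMENT)[::-1]


def find_stem_loops(seq: str, stem: int = 6, min_loop: int = 3,
                    max_loop: int = 8) -> List[Tuple[int, str, int, str]]:
    """List (start, left_arm, loop_length, right_arm) for exact inverted repeats."""
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - 2 * stem - min_loop + 1):
        left = seq[i:i + stem]
        target = reverse_complement(left)
        for loop in range(min_loop, max_loop + 1):
            j = i + stem + loop
            right = seq[j:j + stem]
            if len(right) < stem:
                break  # ran off the end of the sequence
            if right == target:
                hits.append((i, left, loop, right))
    return hits
```

Applied to the invented sequence `"CCGCCGCCTTTTGGCGGCTTTTTT"`, the function reports a single six-base-pair stem (GCCGCC paired with GGCGGC) closed by a four-nucleotide loop.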

Usually, the above described components, comprising a promoter, signal sequence (if desired), coding sequence
of interest, and transcription termination sequence, are put together into expression constructs. Expression
constructs are often maintained in a replicon, such as an extrachromosomal element (e.g. plasmids) capable of
stable maintenance in a host, such as bacteria. The replicon will have a replication system, thus allowing it to be
maintained in a prokaryotic host either for expression or for cloning and amplification. In addition, a replicon
may be either a high or low copy number plasmid. A high copy number plasmid will generally have a copy
number ranging from about 5 to about 200, and usually about 10 to about 150. A host containing a high copy
number plasmid will preferably contain at least about 10, and more preferably at least about 20 plasmids. Either
a high or low copy number vector may be selected, depending upon the effect of the vector and the foreign
protein on the host.

Alternatively, the expression constructs can be integrated into the bacterial genome with an integrating vector. 
Integrating vectors usually contain at least one sequence homologous to the bacterial chromosome that allows 
the vector to integrate. Integrations appear to result from recombinations between homologous DNA in the 
vector and the bacterial chromosome. For example, integrating vectors constructed with DNA from various 
Bacillus strains integrate into the Bacillus chromosome (EP-A-0 127 328). Integrating vectors may also be
comprised of bacteriophage or transposon sequences. 

Usually, extrachromosomal and integrating expression constructs may contain selectable markers to allow for 
the selection of bacterial strains that have been transformed. Selectable markers can be expressed in the bacterial 
host and may include genes which render bacteria resistant to drugs such as ampicillin, chloramphenicol, 
erythromycin, kanamycin (neomycin), and tetracycline [Davies et al. (1978) Annu. Rev. Microbiol. 32:469].
Selectable markers may also include biosynthetic genes, such as those in the histidine, tryptophan, and leucine 
biosynthetic pathways. 

Alternatively, some of the above described components can be put together in transformation vectors. 
Transformation vectors are usually comprised of a selectable marker that is either maintained in a replicon or
developed into an integrating vector, as described above.

Expression and transformation vectors, either extra-chromosomal replicons or integrating vectors, have been 
developed for transformation into many bacteria. For example, expression vectors have been developed for, inter 
alia, the following bacteria: Bacillus subtilis [Palva et al. (1982) Proc. Natl. Acad. Sci. USA 79:5582; EP-A-0 
036 259 and EP-A-0 063 953; WO 84/04541], Escherichia coli [Shimatake et al. (1981) Nature 292:128; Amann
et al. (1985) Gene 40:183; Studier et al. (1986) J. Mol. Biol. 189:113; EP-A-0 036 776, EP-A-0 136 829 and
EP-A-0 136 907], Streptococcus cremoris [Powell et al. (1988) Appl. Environ. Microbiol. 54:655], Streptococcus
lividans [Powell et al. (1988) Appl. Environ. Microbiol. 54:655], Streptomyces lividans [US patent 4,745,056].

Methods of introducing exogenous DNA into bacterial hosts are well-known in the art, and usually include
either the transformation of bacteria treated with CaCl2 or other agents, such as divalent cations and DMSO.
DNA can also be introduced into bacterial cells by electroporation. Transformation procedures usually vary with
the bacterial species to be transformed. See e.g. [Masson et al. (1989) FEMS Microbiol. Lett. 60:273; Palva et al.
(1982) Proc. Natl. Acad. Sci. USA 79:5582; EP-A-0 036 259 and EP-A-0 063 953; WO 84/04541, Bacillus],
[Miller et al. (1988) Proc. Natl. Acad. Sci. 85:856; Wang et al. (1990) J. Bacteriol. 172:949, Campylobacter],
[Cohen et al. (1973) Proc. Natl. Acad. Sci. 69:2110; Dower et al. (1988) Nucleic Acids Res. 16:6127; Kushner
(1978) "An improved method for transformation of Escherichia coli with ColE1-derived plasmids." In Genetic
Engineering: Proceedings of the International Symposium on Genetic Engineering (eds. H.W. Boyer and S.
Nicosia); Mandel et al. (1970) J. Mol. Biol. 53:159; Taketo (1988) Biochim. Biophys. Acta 949:318;
Escherichia], [Chassy et al. (1987) FEMS Microbiol. Lett. 44:173, Lactobacillus], [Fiedler et al. (1988) Anal.
Biochem. 170:38, Pseudomonas], [Augustin et al. (1990) FEMS Microbiol. Lett. 66:203, Staphylococcus],
[Barany et al. (1980) J. Bacteriol. 144:698; Harlander (1987) "Transformation of Streptococcus lactis by
electroporation," in: Streptococcal Genetics (ed. J. Ferretti and R. Curtiss III); Perry et al. (1981) Infect. Immun.
32:1295; Powell et al. (1988) Appl. Environ. Microbiol. 54:655; Somkuti et al. (1987) Proc. 4th Evr. Cong.
Biotechnology 1:412, Streptococcus].

v. Yeast Expression 

Yeast expression systems are also known to one of ordinary skill in the art. A yeast promoter is any DNA
sequence capable of binding yeast RNA polymerase and initiating the downstream (3') transcription of a coding 
sequence (e.g. structural gene) into mRNA. A promoter will have a transcription initiation region which is 
usually placed proximal to the 5' end of the coding sequence. This transcription initiation region usually includes 
an RNA polymerase binding site (the "TATA Box") and a transcription initiation site. A yeast promoter may
also have a second domain called an upstream activator sequence (UAS), which, if present, is usually distal to
the structural gene. The UAS permits regulated (inducible) expression. Constitutive expression occurs in the 
absence of a UAS. Regulated expression may be either positive or negative, thereby either enhancing or 
reducing transcription. 

Yeast is a fermenting organism with an active metabolic pathway; therefore, sequences encoding enzymes in the
metabolic pathway provide particularly useful promoter sequences. Examples include alcohol dehydrogenase
(ADH) (EP-A-0 284 044), enolase, glucokinase, glucose-6-phosphate isomerase,
glyceraldehyde-3-phosphate-dehydrogenase (GAP or GAPDH), hexokinase, phosphofructokinase,
3-phosphoglycerate mutase, and pyruvate kinase (PyK) (EPO-A-0 329 203). The yeast PHO5 gene, encoding acid
phosphatase, also provides useful promoter sequences [Myanohara et al. (1983) Proc. Natl. Acad. Sci. USA 80:1].

In addition, synthetic promoters which do not occur in nature also function as yeast promoters. For example,
UAS sequences of one yeast promoter may be joined with the transcription activation region of another yeast
promoter, creating a synthetic hybrid promoter. Examples of such hybrid promoters include the ADH regulatory
sequence linked to the GAP transcription activation region (US Patent Nos. 4,876,197 and 4,880,734). Other
examples of hybrid promoters include promoters which consist of the regulatory sequences of either the ADH2,
GAL4, GAL10, or PHO5 genes, combined with the transcriptional activation region of a glycolytic enzyme
gene such as GAP or PyK (EP-A-0 164 556). Furthermore, a yeast promoter can include naturally occurring
promoters of non-yeast origin that have the ability to bind yeast RNA polymerase and initiate transcription.
Examples of such promoters include, inter alia, [Cohen et al. (1980) Proc. Natl. Acad. Sci. USA 77:1078;
Henikoff et al. (1981) Nature 283:835; Hollenberg et al. (1981) Curr. Topics Microbiol. Immunol. 96:119;
Hollenberg et al. (1979) "The Expression of Bacterial Antibiotic Resistance Genes in the Yeast Saccharomyces
cerevisiae," in: Plasmids of Medical, Environmental and Commercial Importance (eds. K.N. Timmis and A.
Puhler); Mercerau-Puigalon et al. (1980) Gene 11:163; Panthier et al. (1980) Curr. Genet. 2:109].

A DNA molecule may be expressed intracellularly in yeast. A promoter sequence may be directly linked with
the DNA molecule, in which case the first amino acid at the N-terminus of the recombinant protein will always
be a methionine, which is encoded by the ATG start codon. If desired, methionine at the N-terminus may be
cleaved from the protein by in vitro incubation with cyanogen bromide. 

Fusion proteins provide an alternative for yeast expression systems, as well as in mammalian, baculovirus, and 
bacterial expression systems. Usually, a DNA sequence encoding the N-terminal portion of an endogenous yeast 
protein, or other stable protein, is fused to the 5' end of heterologous coding sequences. Upon expression, this
construct will provide a fusion of the two amino acid sequences. For example, the yeast or human superoxide
dismutase (SOD) gene can be linked at the 5' terminus of a foreign gene and expressed in yeast. The DNA
sequence at the junction of the two amino acid sequences may or may not encode a cleavable site. See e.g.
EP-A-0 196 036. Another example is a ubiquitin fusion protein. Such a fusion protein is made with the ubiquitin
region that preferably retains a site for a processing enzyme (e.g. ubiquitin-specific processing protease) to
cleave the ubiquitin from the foreign protein. Through this method, therefore, native foreign protein can be
isolated (e.g. WO 88/024066).

Alternatively, foreign proteins can also be secreted from the cell into the growth media by creating chimeric 
DNA molecules that encode a fusion protein comprised of a leader sequence fragment that provides for secretion
in yeast of the foreign protein. Preferably, there are processing sites encoded between the leader fragment and
the foreign gene that can be cleaved either in vivo or in vitro. The leader sequence fragment usually encodes a 
signal peptide comprised of hydrophobic amino acids which direct the secretion of the protein from the cell. 

DNA encoding suitable signal sequences can be derived from genes for secreted yeast proteins, such as the 
genes for invertase (EP-A-0012873; JPO 62,096,086) and A-factor (US patent 4,588,684). Alternatively, leaders
of non-yeast origin exist, such as an interferon leader, that also provide for secretion in yeast (EP-A-0060057).

A preferred class of secretion leaders are those that employ a fragment of the yeast alpha-factor gene, which 
contains both a "pre" signal sequence, and a "pro" region. The types of alpha-factor fragments that can be 
employed include the full-length pre-pro alpha factor leader (about 83 amino acid residues) as well as truncated 
alpha-factor leaders (usually about 25 to about 50 amino acid residues) (US Patents 4,546,083 and 4,870,008; 
EP-A-0 324 274). Additional leaders employing an alpha-factor leader fragment that provides for secretion
include hybrid alpha-factor leaders made with a presequence of a first yeast, but a pro-region from a second
yeast alpha-factor (e.g. see WO 89/02463).

Usually, transcription termination sequences recognized by yeast are regulatory regions located 3' to the 
translation stop codon, and thus together with the promoter flank the coding sequence. These sequences direct 
the transcription of an mRNA which can be translated into the polypeptide encoded by the DNA. Examples
include transcription terminator sequences and other yeast-recognized termination sequences, such as those of
genes coding for glycolytic enzymes.

Usually, the above described components, comprising a promoter, leader (if desired), coding sequence of 
interest, and transcription termination sequence, are put together into expression constructs. Expression 
constructs are often maintained in a replicon, such as an extrachromosomal element (e.g. plasmids) capable of 
stable maintenance in a host, such as yeast or bacteria. The replicon may have two replication systems, thus 
allowing it to be maintained, for example, in yeast for expression and in a prokaryotic host for cloning and 
amplification. Examples of such yeast-bacteria shuttle vectors include YEp24 [Botstein et al. (1979) Gene 8:17-24], 
pCl/1 [Brake et al. (1984) Proc. Natl. Acad. Sci. USA 81:4642-4646], and YRp17 [Stinchcomb et al. (1982) 
J. Mol. Biol. 158:157]. In addition, a replicon may be either a high or low copy number plasmid. A high copy 
number plasmid will generally have a copy number ranging from about 5 to about 200, and usually about 10 to 
about 150. A host containing a high copy number plasmid will preferably have at least about 10, and more 
preferably at least about 20. Either a high or low copy number vector may be selected, depending upon the effect 
of the vector and the foreign protein on the host. See e.g. Brake et al., supra. 
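
The copy number ranges stated above can be sketched as a simple check. This is only an illustration: the function name, the dictionary keys, and the treatment of the "about" bounds as exact integers are assumptions made for the example, not part of the disclosure.

```python
# Illustrative sketch only. The bounds come from the text ("about 5 to about 200",
# "about 10 to about 150", "at least about 10", "at least about 20"); treating
# them as exact cut-offs is an assumption made for this example.
GENERAL_HIGH_COPY_RANGE = (5, 200)
USUAL_HIGH_COPY_RANGE = (10, 150)
PREFERRED_MINIMUM = 10
MORE_PREFERRED_MINIMUM = 20


def classify_copy_number(copies_per_cell):
    """Report which of the stated copy number ranges a measured value falls in."""
    if copies_per_cell < 0:
        raise ValueError("copy number cannot be negative")
    general_low, general_high = GENERAL_HIGH_COPY_RANGE
    usual_low, usual_high = USUAL_HIGH_COPY_RANGE
    return {
        "in_general_high_copy_range": general_low <= copies_per_cell <= general_high,
        "in_usual_high_copy_range": usual_low <= copies_per_cell <= usual_high,
        "meets_preferred_minimum": copies_per_cell >= PREFERRED_MINIMUM,
        "meets_more_preferred_minimum": copies_per_cell >= MORE_PREFERRED_MINIMUM,
    }


if __name__ == "__main__":
    # Hypothetical measured values, chosen only to exercise each range.
    for copies in (3, 15, 60):
        print(copies, classify_copy_number(copies))
```
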

Alternatively, the expression constructs can be integrated into the yeast genome with an integrating vector. 
Integrating vectors usually contain at least one sequence homologous to a yeast chromosome that allows the 
vector to integrate, and preferably contain two homologous sequences flanking the expression construct. 
Integrations appear to result from recombinations between homologous DNA in the vector and the yeast 
chromosome [Orr-Weaver et al. (1983) Methods in Enzymol. 101:228-245]. An integrating vector may be 
directed to a specific locus in yeast by selecting the appropriate homologous sequence for inclusion in the vector. 
See Orr-Weaver et al., supra. One or more expression constructs may integrate, possibly affecting levels of 
recombinant protein produced [Rine et al. (1983) Proc. Natl. Acad. Sci. USA 80:6750]. The chromosomal 
sequences included in the vector can occur either as a single segment in the vector, which results in the integration 
of the entire vector, or two segments homologous to adjacent segments in the chromosome and flanking the 
expression construct in the vector, which can result in the stable integration of only the expression construct. 

Usually, extrachromosomal and integrating expression constructs may contain selectable markers to allow for 
the selection of yeast strains that have been transformed. Selectable markers may include biosynthetic genes that 
can be expressed in the yeast host, such as ADE2, HIS4, LEU2, TRP1, and ALG7, and the G418 resistance gene, 
which confer resistance in yeast cells to tunicamycin and G418, respectively. In addition, a suitable selectable 
marker may also provide yeast with the ability to grow in the presence of toxic compounds, such as metal. For 
example, the presence of CUP1 allows yeast to grow in the presence of copper ions [Butt et al. (1987) 
Microbiol. Rev. 51:351]. 

Alternatively, some of the above described components can be put together into transformation vectors. 
Transformation vectors are usually comprised of a selectable marker that is either maintained in a replicon or 
developed into an integrating vector, as described above. 

Expression and transformation vectors, either extrachromosomal replicons or integrating vectors, have been 
developed for transformation into many yeasts. For example, expression vectors have been developed for, inter 
alia, the following yeasts: Candida albicans [Kurtz et al. (1986) Mol. Cell. Biol. 6:142], Candida maltosa 
[Kunze et al. (1985) J. Basic Microbiol. 25:141], Hansenula polymorpha [Gleeson et al. (1986) J. Gen. 
Microbiol. 132:3459; Roggenkamp et al. (1986) Mol. Gen. Genet. 202:302], Kluyveromyces fragilis [Das et al. 
(1984) J. Bacteriol. 158:1165], Kluyveromyces lactis [De Louvencourt et al. (1983) J. Bacteriol. 154:737; Van 
den Berg et al. (1990) Bio/Technology 8:135], Pichia guillerimondii [Kunze et al. (1985) J. Basic Microbiol. 
25:141], Pichia pastoris [Cregg et al. (1985) Mol. Cell. Biol. 5:3376; US Patent Nos. 4,837,148 and 4,929,555], 
Saccharomyces cerevisiae [Hinnen et al. (1978) Proc. Natl. Acad. Sci. USA 75:1929; Ito et al. (1983) J. 
Bacteriol. 153:163], Schizosaccharomyces pombe [Beach and Nurse (1981) Nature 300:706], and Yarrowia 
lipolytica [Davidow et al. (1985) Curr. Genet. 10:39; Gaillardin et al. (1985) Curr. Genet. 10:49]. 

Methods of introducing exogenous DNA into yeast hosts are well-known in the art, and usually include either 
the transformation of spheroplasts or of intact yeast cells treated with alkali cations. Transformation procedures 
usually vary with the yeast species to be transformed. See e.g. [Kurtz et al. (1986) Mol. Cell. Biol. 6:142; Kunze 
et al. (1985) J. Basic Microbiol. 25:141; Candida]; [Gleeson et al. (1986) J. Gen. Microbiol. 132:3459; 
Roggenkamp et al. (1986) Mol. Gen. Genet. 202:302; Hansenula]; [Das et al. (1984) J. Bacteriol. 158:1165; De 
Louvencourt et al. (1983) J. Bacteriol. 154:1165; Van den Berg et al. (1990) Bio/Technology 8:135; 
Kluyveromyces]; [Cregg et al. (1985) Mol. Cell. Biol. 5:3376; Kunze et al. (1985) J. Basic Microbiol. 25:141; 
US Patents 4,837,148 & 4,929,555; Pichia]; [Hinnen et al. (1978) Proc. Natl. Acad. Sci. USA 75:1929; Ito et al. 
(1983) J. Bacteriol. 153:163; Saccharomyces]; [Beach & Nurse (1981) Nature 300:706; Schizosaccharomyces]; 
[Davidow et al. (1985) Curr. Genet. 10:39; Gaillardin et al. (1985) Curr. Genet. 10:49; Yarrowia]. 

Pharmaceutical Compositions 

Pharmaceutical compositions can comprise polypeptides and/or nucleic acid of the invention. The 
pharmaceutical compositions will comprise a therapeutically effective amount of either polypeptides, antibodies, 
or polynucleotides of the claimed invention. 

The term "therapeutically effective amount" as used herein refers to an amount of a therapeutic agent to treat, 
ameliorate, or prevent a desired disease or condition, or to exhibit a detectable therapeutic or preventative effect. 
The effect can be detected by, for example, chemical markers or antigen levels. Therapeutic effects also include 
reduction in physical symptoms, such as decreased body temperature. The precise effective amount for a subject 
will depend upon the subject's size and health, the nature and extent of the condition, and the therapeutics or 
combination of therapeutics selected for administration. Thus, it is not useful to specify an exact effective 
amount in advance. However, the effective amount for a given situation can be determined by routine 
experimentation and is within the judgement of the clinician. 

For purposes of the present invention, an effective dose will be from about 0.01 mg/kg to 50 mg/kg or 0.05 
mg/kg to about 10 mg/kg of the DNA constructs in the individual to which it is administered. 
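
As a worked illustration of the per-kilogram ranges above, the following sketch converts them into absolute amounts for a given body weight. The function name and the example weight of 70 kg are hypothetical, and the "about" bounds are treated as exact values purely for the arithmetic; the actual effective amount remains a matter of routine experimentation and clinical judgement, as stated above.

```python
# Illustrative arithmetic only. The two ranges are taken from the text; the
# function name and the example body weight are hypothetical.
BROAD_RANGE_MG_PER_KG = (0.01, 50.0)
NARROW_RANGE_MG_PER_KG = (0.05, 10.0)


def dose_window_mg(body_weight_kg, range_mg_per_kg):
    """Scale a (low, high) mg/kg range to absolute milligrams for one subject."""
    if body_weight_kg <= 0:
        raise ValueError("body weight must be positive")
    low, high = range_mg_per_kg
    return (low * body_weight_kg, high * body_weight_kg)


if __name__ == "__main__":
    example_weight_kg = 70.0  # assumed value, for illustration only
    for label, mg_per_kg in (("broad", BROAD_RANGE_MG_PER_KG),
                             ("narrow", NARROW_RANGE_MG_PER_KG)):
        low_mg, high_mg = dose_window_mg(example_weight_kg, mg_per_kg)
        print(f"{label} range: {low_mg:.2f} mg to {high_mg:.2f} mg")
```
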

A pharmaceutical composition can also contain a pharmaceutically acceptable carrier. The term 
"pharmaceutically acceptable carrier" refers to a carrier for administration of a therapeutic agent, such as 
antibodies or a polypeptide, genes, and other therapeutic agents. The term refers to any pharmaceutical carrier 
that does not itself induce the production of antibodies harmful to the individual receiving the composition, and 
which may be administered without undue toxicity. Suitable carriers may be large, slowly metabolized 
macromolecules such as proteins, polysaccharides, polylactic acids, polyglycolic acids, polymeric amino acids, 
amino acid copolymers, and inactive virus particles. Such carriers are well known to those of ordinary skill in 
the art. 

Pharmaceutically acceptable salts can be used therein, for example, mineral acid salts such as hydrochlorides, 
hydrobromides, phosphates, sulfates, and the like; and the salts of organic acids such as acetates, propionates, 
malonates, benzoates, and the like. A thorough discussion of pharmaceutically acceptable excipients is available 
in Remington's Pharmaceutical Sciences (Mack Pub. Co., N.J. 1991). 

Pharmaceutically acceptable carriers in therapeutic compositions may contain liquids such as water, saline, 
glycerol and ethanol. Additionally, auxiliary substances, such as wetting or emulsifying agents, pH buffering 
substances, and the like, may be present in such vehicles. Typically, the therapeutic compositions are prepared as 
injectables, either as liquid solutions or suspensions; solid forms suitable for solution in, or suspension in, liquid 
vehicles prior to injection may also be prepared. Liposomes are included within the definition of a 
pharmaceutically acceptable carrier. 

Delivery Methods 

Once formulated, the compositions of the invention can be administered directly to the subject. The subjects to 
be treated can be animals; in particular, human subjects can be treated. 

Direct delivery of the compositions will generally be accomplished by injection, either subcutaneously, 
intraperitoneally, intravenously or intramuscularly or delivered to the interstitial space of a tissue. The 
compositions can also be administered into a lesion. Other modes of administration include oral and pulmonary 
administration, suppositories, and transdermal or transcutaneous applications (e.g. see WO98/20734), needles, 
and gene guns or hyposprays. Dosage treatment may be a single dose schedule or a multiple dose schedule. 

Vaccines 

Vaccines according to the invention may either be prophylactic (i.e. to prevent infection) or therapeutic (i.e. to 
treat disease after infection). 

Such vaccines comprise immunising antigen(s), immunogen(s), polypeptide(s), protein(s) or nucleic acid, 
usually in combination with "pharmaceutically acceptable carriers," which include any carrier that does not itself 
induce the production of antibodies harmful to the individual receiving the composition. Suitable carriers are 
typically large, slowly metabolized macromolecules such as proteins, polysaccharides, polylactic acids, 
polyglycolic acids, polymeric amino acids, amino acid copolymers, lipid aggregates (such as oil droplets or 
liposomes), and inactive virus particles. Such carriers are well known to those of ordinary skill in the art. 
Additionally, these carriers may function as immunostimulating agents ("adjuvants"). Furthermore, the antigen 
or immunogen may be conjugated to a bacterial toxoid, such as a toxoid from diphtheria, tetanus, cholera, H. 
pylori, etc. pathogens. 

Preferred adjuvants to enhance effectiveness of the composition include, but are not limited to: (1) aluminum 
salts (alum), such as aluminum hydroxide, aluminum phosphate, aluminum sulfate, etc; (2) oil-in-water 
emulsion formulations (with or without other specific immunostimulating agents such as muramyl peptides (see 
below) or bacterial cell wall components), such as for example (a) MF59™ (WO 90/14837; Chapter 10 in 
Vaccine design: the subunit and adjuvant approach, eds. Powell & Newman, Plenum Press 1995), containing 
5% Squalene, 0.5% Tween 80, and 0.5% Span 85 (optionally containing various amounts of MTP-PE (see 
below), although not required) formulated into submicron particles using a microfluidizer such as Model 110Y 
microfluidizer (Microfluidics, Newton, MA), (b) SAF, containing 10% Squalane, 0.4% Tween 80, 5% pluronic- 
blocked polymer L121, and thr-MDP (see below) either microfluidized into a submicron emulsion or vortexed to 
generate a larger particle size emulsion, and (c) Ribi™ adjuvant system (RAS), (Ribi Immunochem, Hamilton, 
MT) containing 2% Squalene, 0.2% Tween 80, and one or more bacterial cell wall components from the group 
consisting of monophosphoryl lipid A (MPL), trehalose dimycolate (TDM), and cell wall skeleton (CWS), 
preferably MPL + CWS (Detox™); (3) saponin adjuvants, such as Stimulon™ (Cambridge Bioscience, 
Worcester, MA) may be used or particles generated therefrom such as ISCOMs (immunostimulating 
complexes); (4) Complete Freund's Adjuvant (CFA) and Incomplete Freund's Adjuvant (IFA); (5) cytokines, 
such as interleukins (e.g. IL-1, IL-2, IL-4, IL-5, IL-6, IL-7, IL-12, etc.), interferons (e.g. gamma interferon), 
macrophage colony stimulating factor (M-CSF), tumor necrosis factor (TNF), etc; and (6) other substances that 
act as immunostimulating agents to enhance the effectiveness of the composition. Alum and MF59™ are 
preferred. 

As mentioned above, muramyl peptides include, but are not limited to, N-acetyl-muramyl-L-threonyl-D- 
isoglutamine (thr-MDP), N-acetyl-normuramyl-L-alanyl-D-isoglutamine (nor-MDP), N-acetylmuramyl-L-alanyl- 
D-isoglutaminyl-L-alanine-2-(1'-2'-dipalmitoyl-sn-glycero-3-hydroxyphosphoryloxy)-ethylamine (MTP-PE), etc. 

The immunogenic compositions (e.g. the immunising antigen/immunogen/polypeptide/protein/nucleic acid, 
pharmaceutically acceptable carrier, and adjuvant) typically will contain diluents, such as water, saline, glycerol, 
ethanol, etc. Additionally, auxiliary substances, such as wetting or emulsifying agents, pH buffering substances, 
and the like, may be present in such vehicles. 

Typically, the immunogenic compositions are prepared as injectables, either as liquid solutions or suspensions; 
solid forms suitable for solution in, or suspension in, liquid vehicles prior to injection may also be prepared. The 
preparation also may be emulsified or encapsulated in liposomes for enhanced adjuvant effect, as discussed 
above under pharmaceutically acceptable carriers. 

Immunogenic compositions used as vaccines comprise an immunologically effective amount of the antigenic or 
immunogenic polypeptides, as well as any other of the above-mentioned components, as needed. By 
"immunologically effective amount", it is meant that the administration of that amount to an individual, either in 
a single dose or as part of a series, is effective for treatment or prevention. This amount varies depending upon 
the health and physical condition of the individual to be treated, the taxonomic group of individual to be treated 
(e.g. nonhuman primate, primate, etc.), the capacity of the individual's immune system to synthesize antibodies, 
the degree of protection desired, the formulation of the vaccine, the treating doctor's assessment of the medical 
situation, and other relevant factors. It is expected that the amount will fall in a relatively broad range that can be 
determined through routine trials. 

The immunogenic compositions are conventionally administered parenterally, e.g. by injection, either 
subcutaneously, intramuscularly, or transdermally/transcutaneously (e.g. WO98/20734). Additional formulations suitable 
for other modes of administration include oral and pulmonary formulations, suppositories, and transdermal 
applications. Dosage treatment may be a single dose schedule or a multiple dose schedule. The vaccine may be 
administered in conjunction with other immunoregulatory agents. 

As an alternative to protein-based vaccines, DNA vaccination may be employed [e.g. Robinson & Torres (1997) 
Seminars in Immunology 9:271-283; Donnelly et al. (1997) Annu Rev Immunol 15:617-648; see later herein]. 

Gene Delivery Vehicles 

Gene therapy vehicles for delivery of constructs including a coding sequence of a therapeutic of the invention, to 
be delivered to the mammal for expression in the mammal, can be administered either locally or systemically. 
These constructs can utilize viral or non-viral vector approaches in in vivo or ex vivo modality. Expression of 
such coding sequence can be induced using endogenous mammalian or heterologous promoters. Expression of 
the coding sequence in vivo can be either constitutive or regulated. 

The invention includes gene delivery vehicles capable of expressing the contemplated nucleic acid sequences. 
The gene delivery vehicle is preferably a viral vector and, more preferably, a retroviral, adenoviral, 
adeno-associated viral (AAV), herpes viral, or alphavirus vector. The viral vector can also be an astrovirus, 
coronavirus, orthomyxovirus, papovavirus, paramyxovirus, parvovirus, picornavirus, poxvirus, or togavirus viral 
vector. See generally, Jolly (1994) Cancer Gene Therapy 1:51-64; Kimura (1994) Human Gene Therapy 
5:845-852; Connelly (1995) Human Gene Therapy 6:185-193; and Kaplitt (1994) Nature Genetics 6:148-153. 

Retroviral vectors are well known in the art and we contemplate that any retroviral gene therapy vector is 
employable in the invention, including B, C and D type retroviruses, xenotropic retroviruses (for example, 
NZB-X1, NZB-X2 and NZB9-1 (see O'Neill (1985) J. Virol. 53:160)), polytropic retroviruses (e.g. MCF and 
MCF-MLV (see Kelly (1983) J. Virol. 45:291)), spumaviruses and lentiviruses. See RNA Tumor Viruses, 
Second Edition, Cold Spring Harbor Laboratory, 1985. 

Portions of the retroviral gene therapy vector may be derived from different retroviruses. For example, 
retrovector LTRs may be derived from a Murine Sarcoma Virus, a tRNA binding site from a Rous Sarcoma 
Virus, a packaging signal from a Murine Leukemia Virus, and an origin of second strand synthesis from an 
Avian Leukosis Virus. 

These recombinant retroviral vectors may be used to generate transduction competent retroviral vector particles 
by introducing them into appropriate packaging cell lines (see US patent 5,591,624). Retrovirus vectors can be 
constructed for site-specific integration into host cell DNA by incorporation of a chimeric integrase enzyme into 
the retroviral particle (see WO96/37626). It is preferable that the recombinant viral vector is a replication 
defective recombinant virus. 

Packaging cell lines suitable for use with the above-described retrovirus vectors are well known in the art, are 
readily prepared (see WO95/30763 and WO92/05266), and can be used to create producer cell lines (also termed 
vector cell lines or "VCLs") for the production of recombinant vector particles. Preferably, the packaging cell 
lines are made from human parent cells (e.g. HT1080 cells) or mink parent cell lines, which eliminates 
inactivation in human serum. 

Preferred retroviruses for the construction of retroviral gene therapy vectors include Avian Leukosis Virus, 
Bovine Leukemia Virus, Murine Leukemia Virus, Mink-Cell Focus-Inducing Virus, Murine Sarcoma Virus, 
Reticuloendotheliosis Virus and Rous Sarcoma Virus. Particularly preferred Murine Leukemia Viruses include 
4070A and 1504A (Hartley and Rowe (1976) J. Virol. 19:19-25), Abelson (ATCC No. VR-999), Friend (ATCC 
No. VR-245), Graffi, Gross (ATCC No. VR-590), Kirsten, Harvey Sarcoma Virus and Rauscher (ATCC No. 
VR-998) and Moloney Murine Leukemia Virus (ATCC No. VR-190). Such retroviruses may be obtained from 
depositories or collections such as the American Type Culture Collection ("ATCC") in Rockville, Maryland or 
isolated from known sources using commonly available techniques. 

Exemplary known retroviral gene therapy vectors employable in this invention include those described in patent 
applications GB2200651, EP0415731, EP0345242, EP0334301, WO89/02468; WO89/05349, WO89/09271, 
WO90/02806, WO90/07936, WO94/03622, WO93/25698, WO93/25234, WO93/11230, WO93/10218, 
WO91/02805, WO91/02825, WO95/07994, US 5,219,740, US 4,405,712, US 4,861,719, US 4,980,289, US 
4,777,127, US 5,591,624. See also Vile (1993) Cancer Res 53:3860-3864; Vile (1993) Cancer Res 53:962-967; 
Ram (1993) Cancer Res 53:83-88; Takamiya (1992) J Neurosci Res 33:493-503; Baba (1993) J 
Neurosurg 79:729-735; Mann (1983) Cell 33:153; Cane (1984) Proc Natl Acad Sci 81:6349; and Miller (1990) 
Human Gene Therapy 1. 

Human adenoviral gene therapy vectors are also known in the art and employable in this invention. See, for 
example, Berkner (1988) Biotechniques 6:616 and Rosenfeld (1991) Science 252:431, and WO93/07283, 
WO93/06223, and WO93/07282. Exemplary known adenoviral gene therapy vectors employable in this 
invention include those described in the above referenced documents and in WO94/12649, WO93/03769, 
WO93/19191, WO94/28938, WO95/11984, WO95/00655, WO95/27071, WO95/29993, WO95/34671, 
WO96/05320, WO94/08026, WO94/11506, WO93/06223, WO94/24299, WO95/14102, WO95/24297, 
WO95/02697, WO94/28152, WO94/24299, WO95/09241, WO95/25807, WO95/05835, WO94/18922 and 
WO95/09654. Alternatively, administration of DNA linked to killed adenovirus as described in Curiel (1992) 
Hum. Gene Ther. 3:147-154 may be employed. The gene delivery vehicles of the invention also include 
adenovirus associated virus (AAV) vectors. Leading and preferred examples of such vectors for use in this 
invention are the AAV-2 based vectors disclosed in Srivastava, WO93/09239. Most preferred AAV vectors 
comprise the two AAV inverted terminal repeats in which the native D-sequences are modified by substitution 
of nucleotides, such that at least 5 native nucleotides and up to 18 native nucleotides, preferably at least 10 
native nucleotides up to 18 native nucleotides, most preferably 10 native nucleotides are retained and the 
remaining nucleotides of the D-sequence are deleted or replaced with non-native nucleotides. The native 
D-sequences of the AAV inverted terminal repeats are sequences of 20 consecutive nucleotides in each AAV 
inverted terminal repeat (i.e. there is one sequence at each end) which are not involved in HP formation. The 
non-native replacement nucleotide may be any nucleotide other than the nucleotide found in the native 
D-sequence in the same position. Other employable exemplary AAV vectors are pWP-19 and pWN-1, both of 
which are disclosed in Nahreini (1993) Gene 124:257-262. Another example of such an AAV vector is psub201 
(see Samulski (1987) J. Virol. 61:3096). Another exemplary AAV vector is the Double-D ITR vector. 
Construction of the Double-D ITR vector is disclosed in US Patent 5,478,745. Still other vectors are those 
disclosed in Carter US Patent 4,797,368 and Muzyczka US Patent 5,139,941, Chatterjee US Patent 5,474,935, 
and Kotin WO94/288157. Yet a further example of an AAV vector employable in this invention is 
SSV9AFABTKneo, which contains the AFP enhancer and albumin promoter and directs expression 
predominantly in the liver. Its structure and construction are disclosed in Su (1996) Human Gene Therapy 
7:463-470. Additional AAV gene therapy vectors are described in US 5,354,678, US 5,173,414, US 5,139,941, 
and US 5,252,479. 

The gene therapy vectors of the invention also include herpes vectors. Leading and preferred examples are 
herpes simplex virus vectors containing a sequence encoding a thymidine kinase polypeptide such as those 
disclosed in US 5,288,641 and EP0176170 (Roizman). Additional exemplary herpes simplex virus vectors 
include HFEM/ICP6-LacZ disclosed in WO95/04139 (Wistar), pHSVlac described in Geller (1988) Science 
241:1667-1669 and in WO90/09441 & WO92/07945, HSV Us3::pgC-lacZ described in Fink (1992) Human 
Gene Therapy 3:11-19 and HSV 7134, 2 RH 105 and GAL4 described in EP 0453242 (Breakefield), and those 
deposited with ATCC as accession numbers ATCC VR-977 and ATCC VR-260. 

Also contemplated are alpha virus gene therapy vectors that can be employed in this invention. Preferred alpha 
virus vectors are Sindbis virus vectors. Togaviruses, Semliki Forest virus (ATCC VR-67; ATCC VR-1247), 
Middleberg virus (ATCC VR-370), Ross River virus (ATCC VR-373; ATCC VR-1246), Venezuelan equine 
encephalitis virus (ATCC VR-923; ATCC VR-1250; ATCC VR-1249; ATCC VR-532), and those described in 
US patents 5,091,309, 5,217,879, and WO92/10578. More particularly, those alpha virus vectors described in 
US Serial No. 08/405,627, filed March 15, 1995, WO94/21792, WO92/10578, WO95/07994, US 5,091,309 and 
US 5,217,879 are employable. Such alpha viruses may be obtained from depositories or collections such as the 
ATCC in Rockville, Maryland or isolated from known sources using commonly available techniques. 
Preferably, alphavirus vectors with reduced cytotoxicity are used (see USSN 08/679640). 

DNA vector systems such as eukaryotic layered expression systems are also useful for expressing the nucleic 
acids of the invention. See WO95/07994 for a detailed description of eukaryotic layered expression systems. 
Preferably, the eukaryotic layered expression systems of the invention are derived from alphavirus vectors and 
most preferably from Sindbis viral vectors. 

Other viral vectors suitable for use in the present invention include those derived from polio virus, for example 
ATCC VR-58 and those described in Evans, Nature 339 (1989) 385 and Sabin (1973) J. Biol Standardization 
1:115; rhinovirus, for example ATCC VR-1110 and those described in Arnold (1990) J Cell Biochem L401; pox 
viruses such as canary pox virus or vaccinia virus, for example ATCC VR-111 and ATCC VR-2010 and those 
described in Fisher-Hoch (1989) Proc Natl Acad Sci 86:317; Flexner (1989) Ann NY Acad Sci 569:86, Flexner 
(1990) Vaccine 8:17; in US 4,603,112 and US 4,769,330 and WO89/01973; SV40 virus, for example ATCC 
VR-305 and those described in Mulligan (1979) Nature 277:108 and Madzak (1992) J Gen Virol 73:1533; 
influenza virus, for example ATCC VR-797 and recombinant influenza viruses made employing reverse genetics 
techniques as described in US 5,166,057 and in Enami (1990) Proc Natl Acad Sci 87:3802-3805; Enami & 
Palese (1991) J Virol 65:2711-2713 and Luytjes (1989) Cell 59:110, (see also McMichael (1983) NEJ Med 
309:13, and Yap (1978) Nature 273:238 and Nature (1979) 277:108); human immunodeficiency virus as 
described in EP-0386882 and in Buchschacher (1992) J. Virol. 66:2731; measles virus, for example ATCC 
VR-67 and VR-1247 and those described in EP-0440219; Aura virus, for example ATCC VR-368; Bebaru virus, 
for example ATCC VR-600 and ATCC VR-1240; Cabassou virus, for example ATCC VR-922; Chikungunya 
virus, for example ATCC VR-64 and ATCC VR-1241; Fort Morgan Virus, for example ATCC VR-924; Getah 
virus, for example ATCC VR-369 and ATCC VR-1243; Kyzylagach virus, for example ATCC VR-927; Mayaro 
virus, for example ATCC VR-66; Mucambo virus, for example ATCC VR-580 and ATCC VR-1244; Ndumu 
virus, for example ATCC VR-371; Pixuna virus, for example ATCC VR-372 and ATCC VR-1245; Tonate 
virus, for example ATCC VR-925; Triniti virus, for example ATCC VR-469; Una virus, for example ATCC 
VR-374; Whataroa virus, for example ATCC VR-926; Y-62-33 virus, for example ATCC VR-375; O'Nyong 
virus, Eastern encephalitis virus, for example ATCC VR-65 and ATCC VR-1242; Western encephalitis virus, 
for example ATCC VR-70, ATCC VR-1251, ATCC VR-622 and ATCC VR-1252; and coronavirus, for 
example ATCC VR-740 and those described in Hamre (1966) Proc Soc Exp Biol Med 121:190. 

Delivery of the compositions of this invention into cells is not limited to the above mentioned viral vectors. 
Other delivery methods and media may be employed such as, for example, nucleic acid expression vectors, 
polycationic condensed DNA linked or unlinked to killed adenovirus alone, for example see US Serial No. 
08/366,787, filed December 30, 1994 and Curiel (1992) Hum Gene Ther 3:147-154, ligand linked DNA, for 
example see Wu (1989) J Biol Chem 264:16985-16987, eucaryotic cell delivery vehicles cells, for example see 
US Serial No. 08/240,030, filed May 9, 1994, and US Serial No. 08/404,796, deposition of photopolymerized 
hydrogel materials, hand-held gene transfer particle gun, as described in US Patent 5,149,655, ionizing radiation 
as described in US 5,206,152 and in WO92/11033, nucleic charge neutralization or fusion with cell membranes. 
Additional approaches are described in Philip (1994) Mol Cell Biol 14:2411-2418 and in Woffendin (1994) Proc 
Natl Acad Sci 91:11581-11585. 

Particle mediated gene transfer may be employed, for example see US Serial No. 60/023,867. Briefly, the 
sequence can be inserted into conventional vectors that contain conventional control sequences for high level 
expression, and then incubated with synthetic gene transfer molecules such as polymeric DNA-binding cations 
like polylysine, protamine, and albumin, linked to cell targeting ligands such as asialoorosomucoid, as described 
in Wu & Wu (1987) J. Biol. Chem. 262:4429-4432, insulin as described in Hucked (1990) Biochem Pharmacol 
40:253-263, galactose as described in Plank (1992) Bioconjugate Chem 3:533-539, lactose or transferrin. 

Naked DNA may also be employed. Exemplary naked DNA introduction methods are described in WO90/11092 
and US 5,580,859. Uptake efficiency may be improved using biodegradable latex beads. DNA coated latex 
beads are efficiently transported into cells after endocytosis initiation by the beads. The method may be 
improved further by treatment of the beads to increase hydrophobicity and thereby facilitate disruption of the 
endosome and release of the DNA into the cytoplasm. 

Liposomes that can act as gene delivery vehicles are described in US 5,422,120, WO95/13796, WO94/23697, 
WO91/14445 and EP-524,968. As described in USSN 60/023,867, on non-viral delivery, the nucleic acid 
sequences encoding a polypeptide can be inserted into conventional vectors that contain conventional control 
sequences for high level expression, and then be incubated with synthetic gene transfer molecules such as 
polymeric DNA-binding cations like polylysine, protamine, and albumin, linked to cell targeting ligands such as 
asialoorosomucoid, insulin, galactose, lactose, or transferrin. Other delivery systems include the use of 
liposomes to encapsulate DNA comprising the gene under the control of a variety of tissue-specific or 
ubiquitously-active promoters. Further non-viral delivery suitable for use includes mechanical delivery systems 
such as the approach described in Woffendin et al. (1994) Proc. Natl. Acad. Sci. USA 91(24):11581-11585. 

Moreover, the coding sequence and the product of expression of such can be delivered through deposition of 
photopolymerized hydrogel materials. Other conventional methods for gene delivery that can be used for 
delivery of the coding sequence include, for example, use of hand-held gene transfer particle gun, as described in 
US 5,149,655; use of ionizing radiation for activating transferred gene, as described in US 5,206,152 and 
WO92/11033. 

Exemplary liposome and polycationic gene delivery vehicles are those described in US 5,422,120 and 
4,762,915; in WO 95/13796; WO94/23697; and WO91/14445; in EP-0524968; and in Stryer, Biochemistry, 
pages 236-240 (1975) W.H. Freeman, San Francisco; Szoka (1980) Biochem Biophys Acta 600:1; Bayer (1979) 
Biochem Biophys Acta 550:464; Rivnay (1987) Meth Enzymol 149:119; Wang (1987) Proc Natl Acad Sci 
84:7851; Plant (1989) Anal Biochem 176:420. 

A polynucleotide composition can comprise a therapeutically effective amount of a gene therapy vehicle, as the 
term is defined above. For purposes of the present invention, an effective dose will be from about 0.01 mg/kg to 
50 mg/kg or 0.05 mg/kg to about 10 mg/kg of the DNA constructs in the individual to which it is administered. 

Delivery Methods 

Once formulated, the polynucleotide compositions of the invention can be administered (1) directly to the 
subject; (2) delivered ex vivo, to cells derived from the subject; or (3) in vitro for recombinant protein 
expression. The subjects to be treated can be mammals or birds. Also, human subjects can be treated. 

Direct delivery of the compositions will generally be accomplished by injection, either subcutaneously, 
intraperitoneally, intravenously or intramuscularly or delivered to the interstitial space of a tissue. The 
compositions can also be administered into a lesion. Other modes of administration include oral and pulmonary 
administration, suppositories, and transdermal or transcutaneous applications (e.g. see WO98/20734), needles, 
and gene guns or hyposprays. Dosage treatment may be a single dose schedule or a multiple dose schedule. 

Methods for the ex vivo delivery and reimplantation of transformed cells into a subject are known in the art and 
described in e.g. WO93/14778. Examples of cells useful in ex vivo applications include, for example, stem cells, 
particularly hematopoietic, lymph cells, macrophages, dendritic cells, or tumor cells. 

Generally, delivery of nucleic acids for both ex vivo and in vitro applications can be accomplished by the 
following procedures, for example, dextran-mediated transfection, calcium phosphate precipitation, polybrene 
mediated transfection, protoplast fusion, electroporation, encapsulation of the polynucleotide(s) in liposomes, 
and direct microinjection of the DNA into nuclei, all well known in the art. 

Polynucleotide and polypeptide pharmaceutical compositions 

In addition to the pharmaceutically acceptable carriers and salts described above, the following additional agents 
can be used with polynucleotide and/or polypeptide compositions. 

A. Polypeptides 

One example is polypeptides, which include, without limitation: asialoorosomucoid (ASOR); transferrin; 
asialoglycoproteins; antibodies; antibody fragments; ferritin; interleukins; interferons; granulocyte-macrophage 
colony stimulating factor (GM-CSF); granulocyte colony stimulating factor (G-CSF); macrophage colony 
stimulating factor (M-CSF); stem cell factor; and erythropoietin. Viral antigens, such as envelope proteins, can 
also be used. Also, proteins from other invasive organisms can be used, such as the 17 amino acid peptide from the 
circumsporozoite protein of Plasmodium falciparum known as RII. 

B. Hormones, Vitamins, etc. 

Other groups that can be included are, for example: hormones, steroids, androgens, estrogens, thyroid hormone, 
vitamins, or folic acid. 

C. Polyalkylenes, Polysaccharides, etc. 

Also, polyalkylene glycol can be included with the desired polynucleotides/polypeptides. In a preferred 
embodiment, the polyalkylene glycol is polyethylene glycol. In addition, mono-, di-, or polysaccharides can be 
included. In a preferred embodiment of this aspect, the polysaccharide is dextran or DEAE-dextran. Also, 
chitosan and poly(lactide-co-glycolide) can be included. 

D. Lipids and Liposomes 

The desired polynucleotide/polypeptide can also be encapsulated in lipids or packaged in liposomes prior to 
delivery to the subject or to cells derived therefrom. 

Lipid encapsulation is generally accomplished using liposomes which are able to stably bind or entrap and retain 
nucleic acid. The ratio of condensed polynucleotide to lipid preparation can vary but will generally be around 
1:1 (mg DNA:micromoles lipid), or more of lipid. For a review of the use of liposomes as carriers for delivery of 
nucleic acids, see Hug and Sleight (1991) Biochim. Biophys. Acta 1097:1-17; Straubinger (1983) Meth. 
Enzymol. 101:512-527. 

Liposomal preparations for use in the present invention include cationic (positively charged), anionic (negatively 
charged) and neutral preparations. Cationic liposomes have been shown to mediate intracellular delivery of 
plasmid DNA (Felgner (1987) Proc. Natl. Acad. Sci. USA 84:7413-7416); mRNA (Malone (1989) Proc. Natl. 
Acad. Sci. USA 86:6077-6081); and purified transcription factors (Debs (1990) J. Biol. Chem. 
265:10189-10192), in functional form. 

Cationic liposomes are readily available. For example, N-[1-(2,3-dioleyloxy)propyl]-N,N,N-triethylammonium 
(DOTMA) liposomes are available under the trademark Lipofectin, from GIBCO BRL, Grand Island, NY. (See, 
also, Felgner supra). Other commercially available liposomes include transfectace (DDAB/DOPE) and 
DOTAP/DOPE (Boehringer). Other cationic liposomes can be prepared from readily available materials using 
techniques well known in the art. See, e.g. Szoka (1978) Proc. Natl. Acad. Sci. USA 75:4194-4198; 
WO90/11092 for a description of the synthesis of DOTAP (1,2-bis(oleoyloxy)-3-(trimethylammonio)propane) 
liposomes. 

Similarly, anionic and neutral liposomes are readily available, such as from Avanti Polar Lipids (Birmingham, 
AL), or can be easily prepared using readily available materials. Such materials include phosphatidyl choline, 
cholesterol, phosphatidyl ethanolamine, dioleoylphosphatidyl choline (DOPC), dioleoylphosphatidyl glycerol 
(DOPG), dioleoylphosphatidyl ethanolamine (DOPE), among others. These materials can also be mixed with the 
DOTMA and DOTAP starting materials in appropriate ratios. Methods for making liposomes using these 
materials are well known in the art. 

The liposomes can comprise multilamellar vesicles (MLVs), small unilamellar vesicles (SUVs), or large 
unilamellar vesicles (LUVs). The various liposome-nucleic acid complexes are prepared using methods known 
in the art. See e.g. Straubinger (1983) Meth. Enzymol. 101:512-527; Szoka (1978) Proc. Natl. Acad. Sci. USA 
75:4194-4198; Papahadjopoulos (1975) Biochim. Biophys. Acta 394:483; Wilson (1979) Cell 17:77; Deamer & 
Bangham (1976) Biochim. Biophys. Acta 443:629; Ostro (1977) Biochem. Biophys. Res. Commun. 76:836; 
Fraley (1979) Proc. Natl. Acad. Sci. USA 76:3348; Enoch & Strittmatter (1979) Proc. Natl. Acad. Sci. USA 
76:145; Fraley (1980) J. Biol. Chem. 255:10431; Szoka & Papahadjopoulos (1978) Proc. Natl. Acad. Sci. 
USA 75:145; and Schaefer-Ridder (1982) Science 215:166. 

E. Lipoproteins 

In addition, lipoproteins can be included with the polynucleotide/polypeptide to be delivered. Examples of 
lipoproteins to be utilized include: chylomicrons, HDL, IDL, LDL, and VLDL. Mutants, fragments, or fusions 
of these proteins can also be used. Also, modifications of naturally occurring lipoproteins can be used, such as 
acetylated LDL. These lipoproteins can target the delivery of polynucleotides to cells expressing lipoprotein 
receptors. Preferably, if lipoproteins are included with the polynucleotide to be delivered, no other targeting 
ligand is included in the composition. 

Naturally occurring lipoproteins comprise a lipid and a protein portion. The protein portions are known as 
apoproteins. At present, apoproteins A, B, C, D, and E have been isolated and identified. At least two of 
these contain several proteins, designated by Roman numerals: AI, AII, AIV; CI, CII, CIII. 

A lipoprotein can comprise more than one apoprotein. For example, naturally occurring chylomicrons comprise 
A, B, C, and E apoproteins; over time these lipoproteins lose A and acquire C and E apoproteins. VLDL comprises 
A, B, C, and E apoproteins; LDL comprises apoprotein B; and HDL comprises apoproteins A, C, and E. 

The amino acid sequences of these apoproteins are known and are described in, for example, Breslow (1985) Annu Rev. 
Biochem 54:699; Law (1986) Adv. Exp Med. Biol. 151:162; Chen (1986) J Biol Chem 261:12918; Kane (1980) 
Proc Natl Acad Sci USA 77:2465; and Utermann (1984) Hum Genet 65:232. 

Lipoproteins contain a variety of lipids including triglycerides, cholesterol (free and esters), and phospholipids. 
The composition of the lipids varies in naturally occurring lipoproteins. For example, chylomicrons comprise 
mainly triglycerides. A more detailed description of the lipid content of naturally occurring lipoproteins can be 
found, for example, in Meth. Enzymol. 128 (1986). The composition of the lipids is chosen to aid in 
conformation of the apoprotein for receptor binding activity. The composition of lipids can also be chosen to 
facilitate hydrophobic interaction and association with the polynucleotide binding molecule. 

Naturally occurring lipoproteins can be isolated from serum by ultracentrifugation, for instance. Such methods 
are described in Meth. Enzymol. (supra); Pitas (1980) J. Biochem. 255:5454-5460 and Mahey (1979) J Clin. 
Invest 64:743-750. Lipoproteins can also be produced by in vitro or recombinant methods by expression of the 
apoprotein genes in a desired host cell. See, for example, Atkinson (1986) Annu Rev Biophys Chem 15:403 and 
Radding (1958) Biochim Biophys Acta 30:443. Lipoproteins can also be purchased from commercial suppliers, 
such as Biomedical Technologies, Inc., Stoughton, Massachusetts, USA. Further description of lipoproteins can 
be found in Zuckermann et al. PCT/US97/14465. 

F. Polycationic Agents 

Polycationic agents can be included, with or without lipoprotein, in a composition with the desired 
polynucleotide/polypeptide to be delivered. 

Polycationic agents typically exhibit a net positive charge at physiologically relevant pH and are capable of 
neutralizing the electrical charge of nucleic acids to facilitate delivery to a desired location. These agents have 
in vitro, ex vivo, and in vivo applications. Polycationic agents can be used to deliver nucleic acids to a 
living subject either intramuscularly, subcutaneously, etc. 

The following are examples of useful polypeptides as polycationic agents: polylysine, polyarginine, 
polyornithine, and protamine. Other examples include histones, protamines, human serum albumin, DNA 
binding proteins, non-histone chromosomal proteins, and coat proteins from DNA viruses, such as φX174. 
Transcriptional factors also contain domains that bind DNA and therefore may be useful as nucleic acid 
condensing agents. Briefly, transcriptional factors such as C/CEBP, c-jun, c-fos, AP-1, AP-2, AP-3, CPF, Prot-1, 
Sp-1, Oct-1, Oct-2, CREP, and TFIID contain basic domains that bind DNA sequences. 

Organic polycationic agents include: spermine, spermidine, and putrescine. 

The dimensions and the physical properties of a polycationic agent can be extrapolated from the list above, to 
construct other polypeptide polycationic agents or to produce synthetic polycationic agents. 

Synthetic polycationic agents which are useful include, for example, DEAE-dextran and polybrene. Lipofectin™ 
and lipofectAMINE™ are monomers that form polycationic complexes when combined with 
polynucleotides/polypeptides. 

Nucleic Acid Hybridisation 

"Hybridization" refers to the association of two nucleic acid sequences to one another by hydrogen bonding. 
Typically, one sequence will be fixed to a solid support and the other will be free in solution. Then, the two 
sequences will be placed in contact with one another under conditions that favor hydrogen bonding. Factors that 
affect this bonding include: the type and volume of solvent; reaction temperature; time of hybridization; 
agitation; agents to block the non-specific attachment of the liquid phase sequence to the solid support 
(Denhardt's reagent or BLOTTO); concentration of the sequences; use of compounds to increase the rate of 
association of sequences (dextran sulfate or polyethylene glycol); and the stringency of the washing conditions 
following hybridization. See Sambrook et al. [supra] vol.2, chapt.9, pp.9.47 to 9.57. 

"Stringency" refers to conditions in a hybridization reaction that favor association of very similar sequences over 
sequences that differ. For example, the combination of temperature and salt concentration should be chosen to be 
approximately 12° to 20°C below the calculated Tm of the hybrid under study. The temperature and salt 
conditions can often be determined empirically in preliminary experiments in which samples of genomic DNA 
immobilized on filters are hybridized to the sequence of interest and then washed under conditions of different 
stringencies. See Sambrook et al. at page 9.50. 

Variables to consider when performing, for example, a Southern blot are (1) the complexity of the DNA being 
blotted and (2) the homology between the probe and the sequences being detected. The total amount of the 
fragment(s) to be studied can vary by a magnitude of 10, from 0.1 to 1 µg for a plasmid or phage digest to 10⁻⁹ to 
10⁻⁸ g for a single copy gene in a highly complex eukaryotic genome. For lower complexity polynucleotides, 
substantially shorter blotting, hybridization, and exposure times, a smaller amount of starting polynucleotides, 
and lower specific activity of probes can be used. For example, a single-copy yeast gene can be detected with an 
exposure time of only 1 hour starting with 1 µg of yeast DNA, blotting for two hours, and hybridizing for 4-8 
hours with a probe of 10⁸ cpm/µg. For a single-copy mammalian gene a conservative approach would start with 
10 µg of DNA, blot overnight, and hybridize overnight in the presence of 10% dextran sulfate using a probe of 
greater than 10⁸ cpm/µg, resulting in an exposure time of ~24 hours. 

Several factors can affect the melting temperature (Tm) of a DNA-DNA hybrid between the probe and the 
fragment of interest, and consequently, the appropriate conditions for hybridization and washing. In many cases 
the probe is not 100% homologous to the fragment. Other commonly encountered variables include the length 
and total G+C content of the hybridizing sequences and the ionic strength and formamide content of the 
hybridization buffer. The effects of all of these factors can be approximated by a single equation: 

Tm = 81 + 16.6(log10 Ci) + 0.4[%(G + C)] - 0.6(%formamide) - 600/n - 1.5(%mismatch) 

where Ci is the salt concentration (monovalent ions) and n is the length of the hybrid in base pairs (slightly 
modified from Meinkoth & Wahl (1984) Anal. Biochem. 138:267-284). 
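
As an illustration only, the equation above can be evaluated with a few lines of Python. The function and parameter names below are hypothetical (they do not appear in the cited reference), and the salt concentration is assumed to be expressed in mol/l:

```python
import math


def hybrid_tm(salt_molar, gc_percent, formamide_percent, length_bp, mismatch_percent=0.0):
    """Approximate Tm (in °C) of a DNA-DNA hybrid using the equation in the text.

    salt_molar        -- monovalent ion concentration Ci (assumed to be in mol/l)
    gc_percent        -- %(G + C) of the hybridizing sequence
    formamide_percent -- % formamide in the hybridization buffer
    length_bp         -- hybrid length n in base pairs
    mismatch_percent  -- % mismatch between probe and target
    """
    if salt_molar <= 0 or length_bp <= 0:
        raise ValueError("salt concentration and hybrid length must be positive")
    return (
        81.0
        + 16.6 * math.log10(salt_molar)
        + 0.4 * gc_percent
        - 0.6 * formamide_percent
        - 600.0 / length_bp
        - 1.5 * mismatch_percent
    )


if __name__ == "__main__":
    # Hypothetical example: 600 bp hybrid, 50% G+C, 1 M salt, 50% formamide, 10% mismatch
    print(round(hybrid_tm(1.0, 50.0, 50.0, 600, 10.0), 1))
```

Such a calculation gives only a starting estimate; as noted above, suitable conditions are often determined empirically.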

In designing a hybridization experiment, some factors affecting nucleic acid hybridization can be conveniently 
altered. The temperature of the hybridization and washes and the salt concentration during the washes are the 
simplest to adjust. As the temperature of the hybridization increases (i.e. stringency), it becomes less likely for 
hybridization to occur between strands that are nonhomologous, and as a result, background decreases. If the 
radiolabeled probe is not completely homologous with the immobilized fragment (as is frequently the case in 
gene family and interspecies hybridization experiments), the hybridization temperature must be reduced, and 
background will increase. The temperature of the washes affects the intensity of the hybridizing band and the 
degree of background in a similar manner. The stringency of the washes is also increased with decreasing salt 
concentrations. 

In general, convenient hybridization temperatures in the presence of 50% formamide are 42°C for a probe which 
is 95% to 100% homologous to the target fragment, 37°C for 90% to 95% homology, and 32°C for 85% to 90% 
homology. For lower homologies, formamide content should be lowered and temperature adjusted accordingly, 
using the equation above. If the homology between the probe and the target fragment is not known, the 
simplest approach is to start with both hybridization and wash conditions which are nonstringent. If non-specific 
bands or high background are observed after autoradiography, the filter can be washed at high stringency and 
reexposed. If the time required for exposure makes this approach impractical, several hybridization and/or 
washing stringencies should be tested in parallel. 
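
The temperature guidance in the preceding paragraph can be expressed as a simple lookup. This is a sketch only: the function name is hypothetical, and assigning the shared boundary values (95% and 90%) to the higher temperature band is an assumption, since the text gives overlapping ranges.

```python
def suggested_hybridization_temp_c(homology_percent):
    """Return a convenient hybridization temperature (°C) in 50% formamide.

    Returns None below 85% homology, where the text instead advises lowering
    the formamide content and adjusting the temperature with the Tm equation.
    Boundary values (95%, 90%) are assigned to the higher band by assumption.
    """
    if not 0 <= homology_percent <= 100:
        raise ValueError("homology must be between 0 and 100 percent")
    if homology_percent >= 95:
        return 42
    if homology_percent >= 90:
        return 37
    if homology_percent >= 85:
        return 32
    return None


if __name__ == "__main__":
    for homology in (98, 92, 86, 70):
        print(homology, suggested_hybridization_temp_c(homology))
```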
Nucleic Acid Probe Assays 

Methods such as PCR, branched DNA probe assays, or blotting techniques utilizing nucleic acid probes 
according to the invention can determine the presence of cDNA or mRNA. A probe is said to "hybridize" with a 
sequence of the invention if it can form a duplex or double stranded complex, which is stable enough to be 
detected. 

The nucleic acid probes will hybridize to the Chlamydial nucleotide sequences of the invention (including both 
sense and antisense strands). Though many different nucleotide sequences will encode the amino acid sequence, 
the native Chlamydial sequence is preferred because it is the actual sequence present in cells. mRNA represents 
a coding sequence and so a probe should be complementary to the coding sequence; single-stranded cDNA is 
complementary to mRNA, and so a cDNA probe should be complementary to the non-coding sequence. 
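
The strand relationships described above can be illustrated with a short sketch. The helper names and the example sequence are hypothetical; the code simply computes the reverse complement of a DNA sequence, which is the sequence a DNA probe would need in order to pair with a target written 5' to 3'.

```python
# Watson-Crick pairing table for DNA bases (upper case only)
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    dna = dna.upper()
    if set(dna) - set("ACGT"):
        raise ValueError("sequence may contain only A, C, G and T")
    return dna.translate(_COMPLEMENT)[::-1]


def probe_for_mrna(coding_strand):
    """DNA probe that pairs with an mRNA whose sequence matches the coding strand."""
    return reverse_complement(coding_strand)


def probe_for_first_strand_cdna(coding_strand):
    """DNA probe that pairs with single-stranded cDNA (itself complementary to the mRNA).

    Such a probe is complementary to the non-coding strand, so it has the
    same sequence as the coding strand.
    """
    return reverse_complement(reverse_complement(coding_strand))


if __name__ == "__main__":
    example = "ATGGCTAGC"  # hypothetical coding-strand fragment
    print(probe_for_mrna(example))
    print(probe_for_first_strand_cdna(example))
```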

The probe sequence need not be identical to the Chlamydial sequence (or its complement); some variation in 
the sequence and length can lead to increased assay sensitivity if the nucleic acid probe can form a duplex with 
target nucleotides, which can be detected. Also, the nucleic acid probe can include additional nucleotides to 
stabilize the formed duplex. Additional Chlamydial sequence may also be helpful as a label to detect the formed 
duplex. For example, a non-complementary nucleotide sequence may be attached to the 5' end of the probe, with 
the remainder of the probe sequence being complementary to a Chlamydial sequence. Alternatively, 
non-complementary bases or longer sequences can be interspersed into the probe, provided that the probe 
sequence has sufficient complementarity with a Chlamydial sequence in order to hybridize therewith and 
thereby form a duplex which can be detected. 

The exact length and sequence of the probe will depend on the hybridization conditions, such as temperature, 
salt condition and the like. For example, for diagnostic applications, depending on the complexity of the analyte 
sequence, the nucleic acid probe typically contains at least 10-20 nucleotides, preferably 15-25, and more 
preferably >30 nucleotides, although it may be shorter than this. Short primers generally require cooler 
temperatures to form sufficiently stable hybrid complexes with the template. 

Probes may be produced by synthetic procedures, such as the triester method of Matteucci et al [J. Am. Chem. 
Soc. (1981) 103:3185], or according to Urdea et al [Proc. Natl Acad. Sci. USA (1983) 80: 7461], or using 
commercially available automated oligonucleotide synthesizers. 

The chemical nature of the probe can be selected according to preference. For certain applications, DNA or 
RNA are appropriate. For other applications, modifications may be incorporated e.g. backbone modifications, 
such as phosphorothioates or methylphosphonates, can be used to increase in vivo half-life, alter RNA affinity, 
increase nuclease resistance etc. [e.g. see Agrawal & Iyer (1995) Curr Opin Biotechnol 6:12-19; Agrawal (1996) 
TIBTECH 14:376-387]; analogues such as peptide nucleic acids may also be used [e.g. see Corey (1997) 
TIBTECH 15:224-229; Buchardt et al (1993) TIBTECH 11:384-386]. 

Alternatively, the polymerase chain reaction (PCR) is another well-known means for detecting small amounts of 
target nucleic acids. The assay is described in: Mullis et al [Meth. Enzymol (1987) 155:335-350]; US patents 
4,683,195 & 4,683,202. Two 'primers' hybridize with the target nucleic acids and are used to prime the reaction. 
The primers can comprise sequence that does not hybridize to the sequence of the amplification target (or its 
complement) to aid with duplex stability or, for example, to incorporate a convenient restriction site. Typically, 
such sequence will flank the desired Chlamydial sequence. 

A thermostable polymerase creates copies of target nucleic acids from the primers using the original target 
nucleic acids as a template. After a threshold amount of target nucleic acids are generated by the polymerase, 
they can be detected by more traditional methods, such as Southern blots. When using the Southern blot method, 
the labelled probe will hybridize to the Chlamydial sequence (or its complement). 

Also, mRNA or cDNA can be detected by traditional blotting techniques described in Sambrook et al [supra]. 
mRNA, or cDNA generated from mRNA using a polymerase enzyme, can be purified and separated using gel 
electrophoresis. The nucleic acids on the gel are then blotted onto a solid support, such as nitrocellulose. The 
solid support is exposed to a labelled probe and then washed to remove any unhybridized probe. Next, the 
duplexes containing the labelled probe are detected. Typically, the probe is labelled with a radioactive moiety. 

BRIEF DESCRIPTION OF THE DRAWINGS 
Figures 1-189 show data pertaining to examples 1-189. 
Figure 190 shows a representative 2D gel of proteins in elementary bodies. 
Figure 191 shows an alignment of sequences in five (six) proteins of the invention. 

EXAMPLES 

The examples indicate C.pneumoniae proteins, together with evidence to support the view that the 
proteins are useful antigens for vaccine production and development or for diagnostic purposes. This 
evidence takes the form of: 

• Computer prediction based on sequence information from CWL029 strain (e.g. using the 
PSORT algorithm available from www.psort.nibb.ac.jp). 

• Data on recombinant expression and purification of the proteins cloned from IOL207 strain. 

• Western blots to demonstrate immunoreactivity in serum (typically a blot of an EB extract of 
C.pneumoniae strain FB/96 stained with mouse antiserum against the recombinant protein). 

• FACS analysis of C.pneumoniae bacteria or purified EBs to confirm accessibility of the 
antigen to the immune system (see also table III). 

• An indication if the protein was identified by MALDI-TOF from a 2D gel electrophoresis 
map of proteins from purified elementary bodies from strain FB/96. This confirms that the 
protein is expressed in vivo (see also table V). 

Various tests can be used to assess the in vivo immunogenicity of the proteins identified in the 
examples. For example, the proteins can be expressed recombinantly and used to screen patient sera 
by immunoblot. A positive reaction between the protein and patient serum indicates that the patient 
has previously mounted an immune response to the protein in question i.e. the protein is an 
immunogen. This method can also be used to identify immunodominant proteins. 

The recombinant protein can also be conveniently used to prepare antibodies e.g. in a mouse. These 
can be used for direct confirmation that a protein is located on the cell-surface. Labelled antibody 
(e.g. fluorescent labelling for FACS) can be incubated with intact bacteria and the presence of label 
on the bacterial surface confirms the location of the protein. 

In particular, the following methods (A) to (O) were used to express, purify and biochemically 
characterise the proteins of the invention: 

CLONING OF CPN ORFs FOR EXPRESSION IN E.COLI 

ORFs of Chlamydia pneumoniae (Cpn) were cloned in such a way as to potentially obtain three 
different kinds of proteins: 

a) proteins having a hexa-histidine tag at the C-terminus (cpn-His) 

b) proteins having a GST fusion partner at the N-terminus (Gst-cpn) 

c) proteins having both a hexa-histidine tag at the C-terminus and GST at the N-terminus 
(GST/His fusion; NH2-GST-cpn-(His)6-COOH) 

The type a) proteins were obtained upon cloning in pET21b+ (Novagen). The type b) and c) 
proteins were obtained upon cloning in modified pGEX-KG vectors [Guan & Dixon (1991) Anal 
Biochem. 192:262]: pGEX-KG was modified to obtain pGEX-NN, and pGEX-NN was in turn 
modified to obtain pGEX-NNH. The Gst-cpn and Gst-cpn-His proteins were obtained in pGEX-NN 
and pGEX-NNH respectively. 

The modified versions of pGEX-KG vector were made with the aim of allowing the cloning of 
single amplification products in all three vectors after only one double restriction enzyme digestion 
and to minimise the presence of extraneous amino acids in the final recombinant proteins. 

(A) Construction of pGEX-NN and pGEX-NNH expression vectors 

Two pairs of complementary oligodeoxyribonucleotides were synthesised using the DNA 
synthesiser ABI394 (Perkin Elmer) and the reagents from Cruachem (Glasgow, Scotland). Equimolar 
amounts of the oligo pairs (50 ng each oligo) were annealed in T4 DNA ligase buffer (New England 
Biolabs) for 10 min in a final volume of 50 µl and then were left to cool slowly at room temperature. 
With the described procedure the following DNA linkers were obtained: 

gexNN linker: 

NdeI NheI XmaI EcoRI NcoI SalI XhoI SacI NotI 

GATCCCATATGGCTAGCCCGGGGAATTCGTCCATGGA^ 

GGTATACCGATCGGGCCCCTTAAGGAGGTACCTCACTCAGCTGACTGAGCTCA 

gexNNH linker: 

HindIII NotI XhoI --- Hexa-Histidine --- 
TCGACAAGCTTGCGGCCGCACTCGAGCATCACC^TCACCATCACTGAT 

GTTCGAACGCCGGCGTGAGCACGTAiGAGGTAGTGGTAGTQACTATCGA 

The plasmid pGEX-KG was digested with BamHI and HindIII and 100 ng were ligated overnight at 
16 °C to the linker gexNN with a molar ratio of 3:1 linker/plasmid using 200 units of T4 DNA ligase 
(New England Biolabs). After transformation of the ligation product in E. coli DH5, a clone 
containing the pGEX-NN plasmid, having the correct linker, was selected by means of restriction 
enzyme analysis and DNA sequencing. 

The new plasmid pGEX-NN was digested with SalI and HindIII and ligated to the linker gexNNH. 
After transformation of the ligation product in E. coli DH5, a clone containing the pGEX-NNH 
plasmid, having the correct linker, was selected by means of restriction enzyme analysis and DNA 
sequencing. 

(B) Chromosomal DNA preparation 

The chromosomal DNA of elementary bodies (EB) of C.pneumoniae strain IOL-207 was prepared by 
adding 1.5 ml of lysis buffer (10 mM Tris-HCl, 150 mM NaCl, 2 mM EDTA, 0,6 % SDS, 100 µg/ml 
Proteinase K, pH 8) to 450 µl EB suspension (400.000/µl) and incubating overnight at 37 °C. After 
sequential extraction with phenol, phenol-chloroform, and chloroform, the DNA was precipitated 
with 0,3 M sodium acetate, pH 5,2 and 2 volumes of absolute ethanol. The DNA pellet was washed 
with 70 % ethanol. After solubilization with distilled water and treatment with 20 µg/ml RNAse A 
for 1 hour at RT, the DNA was extracted again with phenol-chloroform, alcohol precipitated and 
suspended with 300 µl 1 mM Tris-HCl pH 8,5. The DNA concentration was evaluated by measuring 
OD260 of the sample. 
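
By way of illustration, an OD260 reading can be converted to a DNA concentration as sketched below. The conversion factor of 50 µg/ml per absorbance unit for double-stranded DNA (1 cm path length) is a widely used convention and is an assumption here, as the text does not state which factor was applied; the function name and the example numbers are likewise hypothetical.

```python
# Conventional factor for double-stranded DNA at 260 nm with a 1 cm path length.
# This value is an assumption for illustration; it is not stated in the text.
DSDNA_UG_PER_ML_PER_A260 = 50.0


def dna_concentration_ug_per_ml(od260, dilution_factor=1.0):
    """Estimate dsDNA concentration (µg/ml) of the undiluted sample from OD260."""
    if od260 < 0 or dilution_factor <= 0:
        raise ValueError("OD260 must be non-negative and dilution factor positive")
    return od260 * DSDNA_UG_PER_ML_PER_A260 * dilution_factor


if __name__ == "__main__":
    # Hypothetical reading: OD260 of 0.5 measured on a 1:10 dilution
    print(dna_concentration_ug_per_ml(0.5, 10))
```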

(C) Oligonucleotide design 

Synthetic oligonucleotide primers were designed on the basis of the coding sequence of each ORF 
using the sequence of C.pneumoniae strain CWL029. Any predicted signal peptide was omitted, by 
deducing the 5' end amplification primer sequence immediately downstream from the predicted 
leader sequence. For most ORFs, the 5' tail of the primers (table I) included only one restriction 
enzyme recognition site (NdeI, or NheI, or SpeI depending on the gene's own restriction pattern); the 
3' primer tails (table I) included a XhoI or a NotI or a HindIII restriction site. 

5' tails                          3' tails 
NdeI   5' GTGCGTCATATG 3'         XhoI      5' GCGTCTCGAG 3' 
NheI   5' GTGCGTGCTAGC 3'         NotI      5' ACTCGCTAGCGGCCGC 3' 
SpeI   5' GTGCGTACTAGT 3'         HindIII   5' GCGTAAGCTT 3' 

Table I. Oligonucleotide tails of the primers used to amplify Cpn genes. 

As well as containing the restriction enzyme recognition sequences, the primers included nucleotides 
which hybridized to the sequence to be amplified. The number of hybridizing nucleotides depended 
on the melting temperature of the primers which was determined as described [Breslauer et al 
(1986) PNAS USA 83:3746-50]. The average melting temperature of the selected oligos was 50-55°C 
for the hybridizing region alone and 65-75°C for the whole oligos. Table II shows the forward and 
reverse primers used for each amplification. 
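
The primer layout described above (a restriction-site tail followed by a hybridizing region) can be sketched as follows. This is an illustration under stated assumptions rather than the procedure actually used: the tail sequences are those of Table I, but the function names, the fixed hybridizing length and the example ORF are hypothetical, and the sketch does not compute melting temperatures, remove signal peptides, or handle stop codons and reading frame.

```python
# Tails taken from Table I; everything else in this sketch is hypothetical.
FIVE_PRIME_TAILS = {
    "NdeI": "GTGCGTCATATG",
    "NheI": "GTGCGTGCTAGC",
    "SpeI": "GTGCGTACTAGT",
}
THREE_PRIME_TAILS = {
    "XhoI": "GCGTCTCGAG",
    "NotI": "ACTCGCTAGCGGCCGC",
    "HindIII": "GCGTAAGCTT",
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Reverse complement of an upper-case DNA sequence."""
    return dna.translate(_COMPLEMENT)[::-1]


def design_primers(orf, hyb_len, fwd_site, rev_site):
    """Return (forward, reverse) primers, each written 5' to 3'.

    The forward primer is tail + first hyb_len bases of the ORF; the reverse
    primer is tail + reverse complement of the last hyb_len bases. In practice
    hyb_len would be chosen per primer from its melting temperature.
    """
    orf = orf.upper()
    if hyb_len <= 0 or hyb_len > len(orf):
        raise ValueError("hybridizing length must fit within the ORF")
    forward = FIVE_PRIME_TAILS[fwd_site] + orf[:hyb_len]
    reverse = THREE_PRIME_TAILS[rev_site] + reverse_complement(orf[-hyb_len:])
    return forward, reverse


if __name__ == "__main__":
    toy_orf = "ATGAAACCCGGGTTTTAA"  # hypothetical sequence, not a Cpn gene
    print(design_primers(toy_orf, 6, "NdeI", "XhoI"))
```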



(D) Amplification 

The standard PCR protocol was as follows: 50 ng genomic DNA were used as template in the 
presence of 0,2 µM each primer, 200 µM each dNTP, 1,5 mM MgCl2, 1x PCR buffer minus Mg 
(Gibco-BRL), and 2 units of Taq DNA polymerase (Platinum Taq, Gibco-BRL) in a final volume of 
100 µl. Each sample underwent a double-step amplification: the first 5 cycles were performed using 
as the hybridizing temperature the one of the oligos excluding the restriction enzyme tail, followed 
by 25 cycles performed according to the hybridization temperature of the whole length primers. The 
standard cycles were as follows: 

denaturation: 94 °C, 2 min 

denaturation: 94 °C, 30 seconds 
hybridization: 51 °C, 50 seconds                } 5 cycles 
elongation: 72 °C, 1 min or 2 min and 40 sec 

denaturation: 94 °C, 30 seconds 
hybridization: 70 °C, 50 seconds                } 25 cycles 
elongation: 72 °C, 1 min or 2 min and 40 sec 

72 °C, 7 min 
4 °C 

The elongation time was 1 min for ORFs shorter than 2000 bp, and 2 min and 40 seconds for ORFs 
longer than 2000 bp. The amplifications were performed using a Gene Amp PCR system 9600 
(Perkin Elmer). 
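
For illustration, the two-step cycling scheme and the length-dependent elongation time can be written out as a small program description. The data layout and function names are hypothetical; the text does not say how an ORF of exactly 2000 bp was treated, so assigning it the longer elongation time is an assumption, and ramp times and the final 4 °C hold are ignored.

```python
def elongation_seconds(orf_length_bp):
    """1 min below 2000 bp, otherwise 2 min 40 s (exactly 2000 bp is an assumption)."""
    return 60 if orf_length_bp < 2000 else 160


def pcr_program(orf_length_bp, anneal_core_c=51, anneal_full_c=70):
    """Return the cycling program as a list of blocks.

    Each block is a dict with a cycle count and a list of
    (temperature in °C, duration in seconds) steps.
    """
    extension = elongation_seconds(orf_length_bp)

    def block(anneal_c, cycles):
        return {"cycles": cycles, "steps": [(94, 30), (anneal_c, 50), (72, extension)]}

    return [
        {"cycles": 1, "steps": [(94, 120)]},  # initial denaturation
        block(anneal_core_c, 5),              # hybridizing region only
        block(anneal_full_c, 25),             # whole-length primer
        {"cycles": 1, "steps": [(72, 420)]},  # final 7 min at 72 °C
    ]


def total_seconds(program):
    """Programmed time in seconds, excluding ramping and the 4 °C hold."""
    return sum(b["cycles"] * sum(t for _, t in b["steps"]) for b in program)


if __name__ == "__main__":
    print(total_seconds(pcr_program(1500)) / 60, "min")
```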

To check the amplification results, 4 µl of each PCR product was loaded onto 1-1.5% agarose gel and 
the size of amplified fragments compared with DNA molecular weight standards (DNA markers III 
or IX, Roche). The PCR products were loaded on agarose gel and after electrophoresis the right size 
bands were excised from the gel. The DNA was purified from the agarose using the Gel Extraction 
Kit (Qiagen) following the instructions of the manufacturer. The final elution volume of the DNA was 
50 µl TE (10 mM Tris-HCl, 1 mM EDTA, pH 8). One µl of each purified DNA was loaded onto 
agarose gel to evaluate the yield. 

(E) Digestion of PCR fragments 

One to two µg of purified PCR product were double digested overnight at 37 °C with the appropriate 
restriction enzymes (60 units of each enzyme) using the appropriate restriction buffer in 100 µl final 
volume. The restriction enzymes and the digestion buffers were from New England Biolabs. After 
purification of the digested DNA (PCR purification Kit, Qiagen) and elution with 30 µl TE, 1 µl was 
subjected to agarose gel electrophoresis to evaluate the yield in comparison to titrated molecular 
weight standards (DNA markers III or IX, Roche). 

(F) Digestion of the cloning vectors (pET21b+, pGEX-NN, and pGEX-NNH) 

10 µg of plasmid was double digested with 100 units of each restriction enzyme in 400 µl reaction 
volume in the presence of appropriate buffer by overnight incubation at 37 °C. After electrophoresis 
on a 1% agarose gel, the band corresponding to the digested vector was purified from the gel using 
the Qiagen Qiaex II Gel Extraction Kit and the DNA was eluted with 50 µl TE. The DNA 
concentration was evaluated by measuring OD260 of the sample. 

(G) Cloning 

75 ng of the appropriately digested and purified vectors and the digested and purified fragments 
corresponding to each ORF were ligated in final volumes of 10-20 µl with a molar ratio of 1:1 
fragment/vector, using 400 units T4 DNA ligase (New England Biolabs) in the presence of the buffer 
supplied by the manufacturer. The reactions were incubated overnight at 16 °C. 
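
As a worked illustration of the molar ratio, the mass of fragment matching a given mass of vector scales with the ratio of their lengths. The function name and the example sizes below are hypothetical; the text does not give vector or insert lengths.

```python
def insert_mass_ng(vector_ng, vector_bp, insert_bp, insert_to_vector_ratio=1.0):
    """Mass of insert (ng) giving the requested insert:vector molar ratio.

    For double-stranded DNA the mass per mole is proportional to length,
    so equal moles means masses in proportion to the lengths in bp.
    """
    if vector_ng <= 0 or vector_bp <= 0 or insert_bp <= 0 or insert_to_vector_ratio <= 0:
        raise ValueError("all arguments must be positive")
    return vector_ng * insert_bp / vector_bp * insert_to_vector_ratio


if __name__ == "__main__":
    # Hypothetical sizes: 75 ng of a 5000 bp vector and a 1000 bp fragment at 1:1
    print(insert_mass_ng(75, 5000, 1000))
```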

Transformation in E. coli DH5 competent cells was performed as follows: the ligation reaction was 
mixed with 200 µl of competent DH5 cells and incubated on ice for 30 min and then at 42 °C for 90 
seconds. After cooling on ice, 0.8 ml LB was added and the cells were incubated for 45 min at 37 °C 
under shaking. 100 and 900 µl of cell suspensions were plated on separate plates of agar LB 100 
µg/ml Ampicillin and the plates were incubated overnight at 37 °C. The screening of the 
transformants was done by growing randomly chosen clones in 6 ml LB 100 µg/ml Ampicillin, by 
extracting the DNA using the Qiagen Qiaprep Spin Miniprep Kit following the manufacturer's 
instructions, and by digesting 2 µl of plasmid minipreparation with the restriction enzymes specific 
for the restriction cloning sites. After agarose gel electrophoresis of the digested plasmid 
mini-preparations, positive clones were chosen on the basis of the correct size of the restriction fragments, 
as evaluated by comparison with appropriate molecular weight markers (DNA markers III or IX, 
Roche). 

(H) Expression 

1 µl of each correct plasmid mini-preparation was transformed into 200 µl of a competent E. coli
strain suitable for expression of the recombinant protein. All pET21b+ recombinant plasmids were
transformed into BL21 DE3 (Novagen) E. coli cells, whilst all pGEX-NN and all pGEX-NNH
recombinant plasmids were transformed into BL21 cells (Novagen). After plating the transformation
mixtures on LB/Amp agar plates and incubating overnight at 37 °C, single colonies were inoculated
into 3 ml LB containing 100 µg/ml ampicillin and grown at 37 °C overnight. 70 µl of the overnight
culture was inoculated into 2 ml LB/Amp and grown at 37 °C until the OD600 of the pET clones
reached 0,4-0,8 or until the OD600 of the pGEX clones reached 0,8-1. Protein expression was then
induced by adding IPTG (isopropyl β-D-thiogalactopyranoside) to the mini-cultures. pET clones
were induced using 1 mM IPTG, whilst pGEX clones were induced using 0.2 mM IPTG. After 3
hours of incubation at 37 °C the final OD600 was checked and the cultures were cooled on ice. After
centrifugation of 0.5 ml of culture, the cell pellet was suspended in 50 µl of protein Loading Sample
Buffer (60 mM TRIS-HCl pH 6.8, 5% w/v SDS, 10% v/v glycerin, 0.1% w/v Bromophenol Blue,
100 mM DTT) and incubated at 100 °C for 5 min. A volume of boiled sample corresponding to 0.1
OD600 of culture was analysed by SDS-PAGE and Coomassie Blue staining to verify the presence of
an induced protein band.
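
The gel-loading volume in the last step depends on the final OD600 of each mini-culture. The sketch below shows one way to compute it, under two assumptions that are not stated in the text: that "0.1 OD600" means 0.1 OD units defined as OD600 x ml of culture, and that the pellet from 0.5 ml of culture is resuspended in 50 µl of loading buffer. The function name and parameters are illustrative.

```python
def gel_load_volume_ul(final_od600, od_units=0.1, culture_ml=0.5, buffer_ul=50.0):
    """Volume of boiled sample (µl) that carries `od_units` of culture.

    Assumption: one OD unit = OD600 x ml of culture.
    """
    if final_od600 <= 0:
        raise ValueError("OD600 must be positive")
    od_units_in_pellet = final_od600 * culture_ml  # OD units spun down
    return buffer_ul * od_units / od_units_in_pellet


# Example: a culture that reached a final OD600 of 2.0.
print(round(gel_load_volume_ul(2.0), 2))
```

Normalising the load in this way keeps the amount of cell material per lane comparable between clones that grew to different densities.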

PURIFICATION OF THE RECOMBINANT PROTEINS 

Single colonies were inoculated into 25 ml LB containing 100 µg/ml ampicillin and grown at 37 °C
overnight. The overnight culture was inoculated into 500 ml LB/Amp and grown with shaking at 25 °C
until the OD600 reached 0,4-0,8 for the pET clones, or 0,8-1 for the pGEX clones. Protein
expression was then induced by adding IPTG to the cultures. pET clones were induced using 1 mM
IPTG, whilst pGEX clones were induced using 0.2 mM IPTG. After 4 hours of incubation at 25 °C the
final OD600 was checked and the cultures were cooled on ice. After centrifugation at 6000 rpm (JA10
rotor, Beckman), the cell pellet was processed for purification or frozen at -20 °C.

(I) Procedure for the purification of soluble His-tagged proteins from E.coli 

1. Transfer the pellets from -20°C to an ice bath and reconstitute with 10 ml of 50 mM NaHPO4 buffer,
300 mM NaCl, pH 8,0. Transfer to 40-50 ml centrifugation tubes and break the cells as per the
following outline:

2. Break the pellets in the French Press performing three passages with in-line washing. 

3. Centrifuge at about 30-40000 x g for 15-20 min. If possible use rotor JA 25.50 (21000 rpm, 15
min.) or JA-20 (18000 rpm, 15 min.).

4. Equilibrate the Poly-Prep columns with 1 ml Fast Flow Chelating Sepharose resin with 50 mM
phosphate buffer, 300 mM NaCl, pH 8,0.

5. Store the centrifugation pellet at -20°C, and load the supernatant in the columns. 

6. Collect the flow through. 

7. Wash the columns with 10 ml (2 ml + 2 ml + 4 ml) 50 mM phosphate buffer, 300 mM NaCl, pH 
8,0. 

8. Wash again with 10 ml 20 mM imidazole buffer, 50 mM phosphate, 300 mM NaCl, pH 8,0.

9. Elute the proteins bound to the columns with 4,5 ml (1,5 ml + 1,5 ml + 1,5 ml) of 250 mM
imidazole buffer, 50 mM phosphate, 300 mM NaCl, pH 8,0 and collect the 3 corresponding
fractions of ~1,5 ml each. Add to each tube 15 µl of 200 mM DTT (final concentration 2 mM).

10. Measure the protein concentration of the first two fractions with the Bradford method, collect a
10 µg aliquot of proteins from each sample and analyse by SDS-PAGE. (N.B.: should the sample
be too diluted, load 21 µl + 7 µl loading buffer).

11. Store the collected fractions at +4°C while waiting for the results of the SDS-PAGE analysis. 

12. For immunisation prepare 4-5 aliquots of 100 µg each in 0,5 ml of 40% glycerol. The dilution
buffer is the above elution buffer, plus 2 mM DTT. Store the aliquots at -20°C until
immunisation.
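
The composition of each immunisation aliquot follows from the Bradford concentration of the eluted fraction. The sketch below is illustrative only: it assumes the glycerol is added from a 100% stock and that the remaining volume is made up with dilution buffer, neither of which is specified above.

```python
def immunisation_aliquot(conc_mg_per_ml, protein_ug=100.0, final_ul=500.0,
                         glycerol_fraction=0.40):
    """Return (protein_ul, glycerol_ul, buffer_ul) for one aliquot.

    Assumptions: 100% glycerol stock; dilution buffer fills the remainder.
    """
    if conc_mg_per_ml <= 0:
        raise ValueError("concentration must be positive")
    protein_ul = protein_ug / conc_mg_per_ml      # 1 mg/ml equals 1 µg/µl
    glycerol_ul = glycerol_fraction * final_ul
    buffer_ul = final_ul - protein_ul - glycerol_ul
    if buffer_ul < 0:
        raise ValueError("fraction too dilute to fit 100 µg into the aliquot")
    return protein_ul, glycerol_ul, buffer_ul


# Example: a fraction measured at 1 mg/ml by the Bradford method.
print(immunisation_aliquot(1.0))
```

The explicit error for very dilute fractions reflects the fact that 100 µg of protein plus 40% glycerol must fit into 0,5 ml.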

(J) Purification of His-tagged proteins from Inclusion bodies 

Purifications were carried out essentially according to the following protocol:

1. Bacteria are collected from 500 ml cultures by centrifugation. If required, store bacterial pellets at
-20°C. For extraction, resuspend each bacterial pellet in 10 ml of 50 mM TRIS-HCl buffer, pH 8,5
on an ice bath.

2. Disrupt the resuspended bacteria with a French Press, performing two passages. 

3. Centrifuge at 35000 x g for 15 min and collect the pellets. Use a Beckman rotor JA 25.50 (21000
rpm, 15 min.) or JA-20 (18000 rpm, 15 min.).

4. Dissolve the centrifugation pellets with 50 mM TRIS-HCl, 1 mM TCEP (Tris(2-carboxyethyl)phosphine
hydrochloride, Pierce), 6M guanidinium chloride, pH 8,5. Stir for ~10 min. with a
magnetic bar.

5. Centrifuge as described above, and collect the supernatant.

6. Prepare an adequate number of Poly-Prep (Bio-Rad) columns containing 1 ml of Fast Flow
Chelating Sepharose (Pharmacia) saturated with nickel according to the manufacturer's
recommendations. Wash the columns twice with 5 ml of H2O and equilibrate with 50 mM
TRIS-HCl, 1 mM TCEP, 6M guanidinium chloride, pH 8,5.

7. Load the supernatants from step 5 onto the columns, and wash with 5 ml of 50 mM TRIS-HCl
buffer, 1 mM TCEP, 6M urea, pH 8,5.

8. Wash the columns with 10 ml of 20 mM imidazole, 50 mM TRIS-HCl, 6M urea, 1 mM TCEP,
pH 8,5. Collect and set aside the first 5 ml for possible further controls.

9. Elute the proteins bound to the columns with 4,5 ml of a buffer containing 250 mM imidazole, 50
mM TRIS-HCl, 6M urea, 1 mM TCEP, pH 8,5. Add the elution buffer in three 1,5 ml aliquots,
and collect the corresponding 3 fractions. Add to each fraction 15 µl DTT (final concentration 2
mM).

10. Measure eluted protein concentration with the Bradford method, and analyze aliquots of ca. 10 µg
of protein by SDS-PAGE.

11. Store proteins at -20°C in 40% (v/v) glycerol, 50 mM TRIS-HCl, 2M urea, 0.5 M arginine, 2 mM
DTT, 0.3 mM TCEP, 83.3 mM imidazole, pH 8,5.
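
Comparing the elution buffer of step 9 with the storage buffer of step 11 suggests that the eluate is diluted roughly three-fold for storage. This is an inference from the stated concentrations, not something the protocol says explicitly. A short sketch of the comparison:

```python
# Concentrations as stated in step 9 (elution) and step 11 (storage).
ELUTION = {"imidazole_mM": 250.0, "urea_M": 6.0, "TCEP_mM": 1.0}
STORAGE = {"imidazole_mM": 83.3, "urea_M": 2.0, "TCEP_mM": 0.3}


def dilution_factors(before, after):
    """Fold-dilution of each component present in both buffers."""
    return {name: before[name] / after[name] for name in after}


factors = dilution_factors(ELUTION, STORAGE)
for name, factor in factors.items():
    print(f"{name}: {factor:.2f}-fold")
```

Imidazole and urea both come out at about 3-fold; TCEP is slightly higher, which is consistent with the storage value having been rounded to one decimal place.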
(K) Procedure for the purification of GST-fusion proteins from E.coli

1. Transfer the bacterial pellets from -20°C to an ice bath and resuspend with 7,5 ml PBS, pH 7,4
to which a mixture of protease inhibitors (COMPLETE™ - Boehringer Mannheim, 1 tablet every
25 ml of buffer) has been added. Transfer to 40-50 ml centrifugation tubes and sonicate
according to the following procedure:

a) Position the probe at about 0,5 cm from the bottom of the tube 

b) Block the tube with the clamp 

c) Dip the tube in an ice bath 

d) Set the sonicator as follows: Timer -> Hold, Duty Cycle -> 55, Out. Control -> 6.

e) Perform 5 cycles of 10 impulses at a time lapse of 1 minute (i.e. one cycle = 10 impulses
+ ~45" hold, repeated five times).

2. Centrifuge at about 30-40000 x g for 15-20 min. E.g.: use rotor Beckman JA 25.50 at 21000 
rpm, for 15 min. 

3. Store the centrifugation pellets at -20°C, and load the supernatants on the chromatography
columns, as follows:

4. Equilibrate the Poly-Prep (Bio-Rad) columns with 0,5 ml (=1 ml suspension) of Glutathione-Sepharose
4B resin, wash with 2 ml (1 + 1) H2O, and then with 10 ml (2 + 4 + 4) PBS, pH 7,4.

5. Load the supernatants on the columns and discard the flow through.

6. Wash the columns with 10 ml (2 + 4 + 4) PBS, pH 7,4.

7. Elute the proteins bound to the columns with 4,5 ml of 50 mM TRIS buffer, 10 mM reduced
glutathione, pH 8.0, adding 1,5 ml + 1,5 ml + 1,5 ml and collecting the respective 3 fractions of
~1,5 ml each.

8. Measure the protein concentration of the first two fractions with the Bradford method, and analyse a
10 µg aliquot of proteins from each sample by SDS-PAGE. (N.B.: if the sample is too diluted,
load 21 µl + 7 µl loading buffer).

9. Store the collected fractions at +4°C while waiting for the results of the SDS-PAGE analysis. 

10. For each protein destined for immunisation prepare 4-5 aliquots of 100 µg each in 0,5 ml of
40% glycerol. The dilution buffer is 50 mM TRIS-HCl, 2 mM DTT, pH 8,0. Store the aliquots at
-20°C until immunisation.

SEROLOGY 

(L) Protocol of immunization 

1. Groups of four CD1 female mice aged between 6 and 7 weeks were immunized with 20 µg of
recombinant protein resuspended in 100 µl.

2. The four mice in each group received 3 doses at 14-day intervals.

3. Immunization was performed through intra-peritoneal injection of the protein with an equal
volume of Complete Freund's Adjuvant (CFA) for the first dose and Incomplete Freund's Adjuvant
(IFA) for the following two doses.

4. Sera were collected before each immunization. Mice were sacrificed 14 days after the third
immunization and the collected sera were pooled and stored at -20°C.
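
The timing implied by this protocol can be laid out as a simple schedule. The sketch below assumes the first dose is given on day 0 (the text gives only intervals, not absolute days); the function name is illustrative.

```python
def immunisation_schedule(doses=3, interval_days=14):
    """Return ([(day, adjuvant), ...], final_bleed_day), counting from day 0.

    CFA is used for the first dose and IFA for the remaining doses; the
    final bleed is one interval after the last dose.
    """
    dose_days = [i * interval_days for i in range(doses)]
    adjuvants = ["CFA"] + ["IFA"] * (doses - 1)
    final_bleed_day = dose_days[-1] + interval_days
    return list(zip(dose_days, adjuvants)), final_bleed_day


schedule, final_bleed = immunisation_schedule()
print(schedule, final_bleed)
```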

(M) Western blot analysis of Cpn elementary body proteins with mouse sera 

Aliquots of elementary bodies containing approximately 4 µg of proteins, mixed with SDS loading
buffer (1x: 60 mM TRIS-HCl pH 6.8, 5% w/v SDS, 10% v/v glycerin, 0.1% Bromophenol Blue, 100
mM DTT) and boiled for 5 minutes at 95 °C, were loaded on a 12% SDS-PAGE gel. The gel was run
using an SDS-PAGE running buffer containing 250 mM TRIS, 2.5 mM Glycine and 0.1% SDS. The
gel was electroblotted onto a nitrocellulose membrane at 200 mA for 30 minutes. The membrane was
blocked for 30 minutes with PBS, 3% skimmed milk powder and incubated overnight at 4 °C with the
appropriate dilution (1/100) of the sera. After washing twice with PBS + 0.1% Tween (Sigma) the
membrane was incubated for 2 hours with peroxidase-conjugated secondary anti-mouse antibody
(Sigma) diluted 1:3000. The nitrocellulose was washed twice for 10 minutes with PBS + 0.1%
Tween-20 and once with PBS and thereafter developed with the Opti-4CN Substrate Kit (Bio-Rad).

Lanes shown in Western blots are: (P) = pre-immune control serum; (I) = immune serum. 

(N) FACS analysis of Chlamydia pneumoniae elementary bodies with mouse sera 
1. 2x10^5 Elementary Bodies (EB)/well were washed with 200 µl of PBS-0.1%BSA in a 96-well
U-bottom plate and centrifuged for 10 min. at 1200 rpm, at 4°C.

2. The supernatant was discarded and the E.B. resuspended in 10 µl of PBS-0.1%BSA.

3. 10 µl of mouse sera diluted in PBS-0.1%BSA were added to the E.B. suspension to a final dilution
of 1:400, and incubated on ice for 30 min.

4. EB were washed by adding 180 µl PBS-0.1%BSA and centrifuged for 10 min. at 1200 rpm, 4°C.

5. The supernatant was discarded and the E.B. resuspended in 10 µl of PBS-0.1%BSA.

6. 10 µl of a goat anti-mouse IgG, F(ab')2 fragment specific-R-Phycoerythrin-conjugated (Jackson
Immunoresearch Laboratories Inc., cat. N° 115-116-072) was added to the EB suspension to a
final dilution of 1:100, and incubated on ice for 30 min. in the dark.

7. EB were washed by adding 180 µl PBS-0.1%BSA and centrifuged for 10 min. at 1200 rpm, 4°C.

8. The supernatant was discarded and the E.B. resuspended in 150 µl of PBS-0.1%BSA.

9. E.B. suspension was passed through a cytometric chamber of a FACS Calibur (Becton Dickinson,
Mountain View, CA USA) and 10,000 events were acquired.

10. Data were analysed using Cell Quest Software (Becton Dickinson, Mountain View, CA USA) by
drawing a morphological dot plot (using forward and side scatter parameters) on E.B. signals. A
histogram plot was then created on FL2 intensity of fluorescence log scale recalling the
morphological region of EB.
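
Steps 3 and 6 quote final dilutions after a reagent is mixed with the resuspended EB. The sketch below works back to the pre-dilution that would be needed, assuming the reagent and the EB suspension are both 10 µl so that mixing halves the concentration. The function name is illustrative.

```python
from fractions import Fraction


def required_predilution(final_dilution, reagent_ul=10, eb_ul=10):
    """Denominator of the pre-dilution giving 1:final_dilution after mixing.

    Mixing reagent_ul with eb_ul dilutes the reagent by a further factor of
    (reagent_ul + eb_ul) / reagent_ul.
    """
    mixing_fraction = Fraction(reagent_ul, reagent_ul + eb_ul)
    return Fraction(final_dilution) * mixing_fraction


print(required_predilution(400))  # mouse sera, step 3
print(required_predilution(100))  # conjugated secondary antibody, step 6
```

Under these assumptions the sera would be pre-diluted 1:200 and the secondary antibody 1:50 before being added to the wells.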

NB: the results of FACS depend not only on the extent of accessibility of the native antigens but also
on the quality of the antibodies elicited by the recombinant antigens, which may have structures with
a variable degree of correct folding as compared with the native protein structures. Therefore, even if
a FACS assay appears negative this does not necessarily mean that the protein is not abundant or
accessible on the surface. PorB antigen, for instance, gave negative results in FACS but is a
surface-exposed neutralising antigen [Kubo & Stephens (2000) Mol Microbiol 38:772-780].

(O) Mass Spectrometry analysis of two-dimensional electrophoretic protein maps 

Gradient purified EBs from strain FB/96 were solubilized at a final concentration of 5.5 mg/ml with
immobiline rehydration buffer (7M urea, 2M thiourea, 2% (w/v) CHAPS, 2% (w/v) ASB 14
[Chevallet et al (1998) Electrophor. 19:1901-9], 2% (v/v) CA 3-10NL (Amersham Pharmacia
Biotech), 2 mM tributyl phosphine, 65 mM DTT). Samples (250 µg protein) were adsorbed overnight
on Immobiline DryStrips (7 cm, pH 3-10 non linear). Electrofocusing was performed in an IPGphor
Isoelectric Focusing Unit (Amersham Pharmacia Biotech). Before PAGE separation, the focused
strips were incubated in 4M urea, 2M thiourea, 30% (v/v) glycerol, 2% (w/v) SDS, 5 mM tributyl
phosphine, 2.5% (w/v) acrylamide, 50 mM Tris-HCl pH 8.8, as described [Herbert et al (1998)
Electrophor. 19:845-51]. SDS-PAGE was performed on linear 9-16% acrylamide gradients. Gels
were stained with colloidal Coomassie (Novex, San Diego) [Doherty et al (1998) Electrophor.
19:355-63]. Stained gels were scanned with a Personal Densitometer SI (Molecular Dynamics) at 8
bits and 50 µm per pixel. Map images were annotated with the software Image Master 2D Elite,
version 3.10 (Amersham Pharmacia Biotech). Protein spots were excised from the gel, using an Ettan
Spot picker (Amersham Pharmacia Biotech), and dried in a vacuum centrifuge. In-gel digestion of
samples for mass spectrometry and extraction of peptides were performed as described by Wilm et
al [Nature (1996) 379:466-9]. Samples were desalted with a ZIP TIP (Millipore), eluted with a
saturated solution of alpha-cyano-4-hydroxycinnamic acid in 50% acetonitrile, 0.1% TFA and
directly loaded onto a SCOUT 381 multiprobe plate (Bruker). Spectra were acquired on a Bruker
Biflex II MALDI-TOF. Spectra were calibrated using a combination of known standard peptides,
located in spots adjacent to the samples. Resulting values for monoisotopic peaks were used for
database searches using the computer program Mascot (www.matrixscience.com). All searches were
performed using an error of 200-500 ppm as constraint. A representative gel is shown in Figure 190.
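
The 200-500 ppm constraint translates into an absolute mass window that scales with peptide mass. The sketch below illustrates that conversion only; it is not a description of how Mascot applies the tolerance internally, and the example masses are hypothetical.

```python
def mass_window(mass_da, tolerance_ppm):
    """Lower and upper mass limits (Da) for a given ppm tolerance."""
    delta = mass_da * tolerance_ppm / 1e6
    return mass_da - delta, mass_da + delta


def within_tolerance(observed_da, theoretical_da, tolerance_ppm):
    """True if the observed mass lies within the ppm window of the theoretical mass."""
    return abs(observed_da - theoretical_da) <= theoretical_da * tolerance_ppm / 1e6


# Hypothetical monoisotopic peptide mass of 1500 Da at the 200 ppm limit.
low, high = mass_window(1500.0, 200)
print(round(low, 3), round(high, 3))
```

At 200 ppm a 1500 Da peptide is matched within about 0.3 Da; at 500 ppm the window widens to about 0.75 Da.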

Example 1 

The following C.pneumoniae protein (pid 4376552) was expressed <SEQ ID 1; cp6552>:

1 MKKKLSLLVG LIFVLSSCHK EDAQNKIRIV ASPTPHAELL ESLQEEAKDL

51 GIKLKILPVD DYRIPNRLLL DKQVDANYFQ HQAFLDDECE RYDCKGELVV

101 IAKVHLEPQA IYSKKHSSLE RLKSQKKLTI AIPVDRTNAQ RALHLLEECG

151 LIVCKGPANL NMTAKDVCGK ENRSINILEV SAPLLVGSLP DVDAAVIPGN

201 FAIAANLSPK KDSLCLEDLS VSKYTNLVVI RSEDVGSPKM IKLQKLFQSP

251 SVQHFFDTKY HGNILTMTQD NG*

A predicted signal peptide is highlighted. 



The cp6552 nucleotide sequence <SEQ ID 2> is: 



1 ATGAAAAAAA AATTATCATT ACTTGTAGGT TTAATTTTTG TTTTGAGTTC 

51 TTGCCATAAG GAAGATGCTC AGAATAAAAT ACGTATTGTA GCCAGTCCGA 

101 CACCTCATGC GGAATTATTG GAGAGTTTAC AGGAAGAGGC TAAAGATCTT 

151 GGAATCAAGC TGAAAATACT TCCAGTAGAT GATTATCGTA TTCCTAATCG 

201 TTTGCTTTTG GATAAACAAG TAGATGCAAA TTACTTTCAA CATCAAGCTT

251 TTCTTGATGA CGAATGCGAG CGTTATGATT GTAAGGGTGA ATTAGTTGTT 

301 ATCGCTAAAG TTCATTTGGA ACCTCAAGCA ATTTATTCTA AGAAACATTC 

351 TTCTTTAGAG CGCTTAAAAA GCCAGAAGAA ACTGACTATA GCGATTCCTG 

401 TGGATCGTAC GAATGCTCAG CGTGCTCTAC ACTTGTTAGA AGAGTGCGGA 

451 CTCATTGTTT GCAAAGGGCC TGCTAATTTA AATATGACAG CTAAAGATGT 

501 CTGTGGGAAA GAAAATAGAA GTATCAACAT ATTAGAGGTG TCAGCTCCTC 

551 TTCTTGTCGG ATCTCTTCCT GACGTTGATG CTGCTGTCAT TCCTGGAAAT 

601 TTTGCTATAG CAGCAAACCT TTCTCCAAAG AAAGATAGTC TTTGTTTAGA 

651 GGATCTTTCG GTATCTAAGT ATACAAACCT TGTTGTCATT CGTTCTGAAG 

701 ACGTAGGTTC TCCTAAAATG ATAAAATTAC AGAAGCTGTT TCAATCTCCT 

751 TCTGTACAAC ATTTTTTTGA TACAAAATAT CATGGGAATA TTTTGACAAT 

801 GACTCAAGAC AATGGTTAG 

The PSORT algorithm predicts an inner membrane location (0.127). 

The protein was expressed in E.coli and purified as a his-tag product, as shown in Figure 1A, and
also as a GST-fusion. The recombinant protein was used to immunise mice, whose sera were used in
a Western blot (Figure 1B) and for FACS analysis (Figure 1C).

The cp6552 protein was also identified in the 2D-PAGE experiment (Cpn0278). 

These experiments show that cp6552 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 
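
The correspondence between a nucleotide listing and its protein listing can be checked by translating the open reading frame. The sketch below uses the standard codon assignments and is offered as an illustration only, not as the tool used to prepare the sequence listing. The example input is the first 30 nucleotides and the last 9 nucleotides of SEQ ID 2 as given above.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a DNA string (whitespace ignored) in frame 1; '*' marks a stop."""
    seq = "".join(dna.split()).upper()
    if len(seq) % 3 != 0:
        raise ValueError("sequence length is not a multiple of three")
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq), 3))


print(translate("ATGAAAAAAA AATTATCATT ACTTGTAGGT"))  # start of SEQ ID 2
print(translate("AATGGTTAG"))                         # end of SEQ ID 2
```

Because whitespace is stripped before translation, rows can be pasted directly from the listing with their ten-base grouping intact.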



Example 2 

The following C.pneumoniae protein (pid 4376736) was expressed <SEQ ID 3; cp6736>:

1 MKTSIRKFLI STTLAPCFAS TAFTVEVIMP SENFDGSSGK IFPYTTLSDP

51 RGTLCIFSGD LYIANLDNAI SRTSSSCFSN RAGALQILGK GGVFSFLNIR

101 SSADGAAISS VITQNPELCP LSFSGFSQMI FDNCESLTSD TSASNVIPHA

151 SAIYATTPML FTNNDSILFQ YNRSAGFGAA IRGTSITIEN TKKSLLFNGN

201 GSISNGGALT GSAAINLINN SAPVIFSTNA TGIYGGAIYL TGGSMLTSGN 

251 LSGVLFVNNS SRSGGAIYAN GNVTFSNNSD LTFQNNTASP QNSLPAPTPP

301 PTPPAVTPLL GYGGAIFCTP PATPPPTGVS LTISGENSVT FLENIASEQG 

351 GALYGKKISI DSNKSTIFLG NTAGKGGAIA IPESGELSLS ANQGDILFNK 

401 NLSITSGTPT RNSIHFGKDA KFATLGATQG YTLYFYDPIT SDDLSAASAA 

451 ATVVVNPKAS ADGAYSGTIV FSGETLTATE AATPANATST LNQKLELEGG

501 TLALRNGATL NVHNFTQDEK SVVIMDAGTT LATTNGANNT DGAITLNKLV

551 INLDSLDGTK AAVVNVQSTN GALTISGTLG LVKNSQDCCD NHGMFNKDLQ

601 QVPILELKAT SNTVTTTDFS LGTNGYQQSP YGYQGTWEFT IDTTTHTVTG

651 NWKKTGYLPH PERLAPLIPN SLWANVIDLR AVSQASAADG EDVPGKQLSI

701 TGITNFFHAN HTGDARSYRH MGGGYLINTY TRITPDAALS LGFGQLFTKS 

751 KDYLVGHGHS NVYFATVYSN ITKSLFGSSR FFSGGTSRVT YSRSNEKVKT 

801 SYTKLPKGRC SWSNNCWLGE LEGNLPITLS SRILNLKQII PFVKAEVAYA 

851 THGGIQENTP EGRIFGHGHL LNVAVPVGVR FGKNSHNRPD FYTIIVAYAP

901 DVYRHNPDCD TTLPINGATW TSIGNNLTRS TLLVQASSHT SVNDVLEIFG

951 HCGCDIRRTS RQYTLDIGSK LRF* 

A predicted signal peptide is highlighted.

The cp6736 nucleotide sequence <SEQ ID 4> is:

1 ATGAAAACGT CTATTCGTAA GTTCTTAATT TCTACCACAC TGGCGCCATG 

51 TTTTGCTTCA ACAGCGTTTA CTGTAGAAGT TATCATGCCT TCCGAGAACT 

101 TTGATGGATC GAGTGGGAAG ATTTTTCCTT ACACAACACT TTCTGATCCT 

151 AGAGGGACAC TCTGTATTTT TTCAGGGGAT CTCTACATTG CGAATCTTGA 

201 TAATGCCATA TCCAGAACCT CTTCCAGTTG CTTTAGCAAT AGGGCGGGAG 

251 CACTACAAAT CTTAGGAAAA GGTGGGGTTT TCTCCTTCTT AAATATCCGT 

301 TCTTCAGCTG ACGGAGCCGC GATTAGTAGT GTAATCACCC AAAATCCTGA

351 ACTATGTCCC TTGAGTTTTT CAGGATTTAG TCAGATGATC TTCGATAACT 

401 GTGAATCTTT GACTTCAGAT ACCTCAGCGA GTAATGTCAT ACCTCACGCA 

451 TCGGCGATTT ACGCTACAAC GCCCATGCTC TTTACAAACA ATGACTCCAT 

501 ACTATTCCAA TACAACCGTT CTGCAGGATT TGGAGCTGCC ATTCGAGGCA 

551 CAAGCATCAC AATAGAAAAT ACGAAAAAGA GCCTTCTCTT TAATGGTAAT 

601 GGATCCATCT CTAATGGAGG GGCCCTCACG GGATCTGCAG CGATCAACCT 

651 CATCAACAAT AGCGCTCCTG TGATTTTCTC AACGAATGCT ACAGGGATCT 

701 ATGGTGGGGC TATTTACCTT ACCGGAGGAT CTATGCTCAC CTCTGGGAAC 

751 CTCTCAGGAG TCTTGTTCGT TAATAATAGC TCGCGCTCAG GAGGCGCTAT 

801 CTATGCTAAC GGAAATGTCA CATTTTCTAA TAACAGCGAC CTGACTTTCC 

851 AAAACAATAC AGCATCTCCA CAAAACTCCT TACCTGCACC TACACCTCCA 

901 CCTACACCAC CAGCAGTCAC TCCTTTGTTA GGATATGGAG GCGCCATCTT 

951 CTGTACTCCT CCAGCTACCC CCCCACCAAC AGGTGTTAGC CTGACTATAT 

1001 CTGGAGAAAA CAGCGTTACA TTCCTAGAAA ACATTGCCTC CGAACAAGGA

1051 GGAGCCCTCT ATGGCAAAAA GATCTCTATA GATTCTAATA AATCTACAAT 

1101 ATTTCTTGGA AATACAGCTG GAAAAGGAGG CGCTATTGCT ATTCCCGAAT 

1151 CTGGGGAGCT CTCTCTATCC GCAAATCAAG GTGATATCCT CTTTAACAAG 

1201 AACCTCAGCA TCACTAGTGG GACACCTACT CGCAATAGTA TTCACTTCGG 

1251 AAAAGATGCC AAGTTTGCCA CTCTAGGAGC TACGCAAGGC TATACCCTAT

1301 ACTTCTATGA TCCGATTACA TCTGATGATT TATCTGCTGC ATCCGCAGCC

1351 GCTACTGTGG TCGTCAATCC CAAAGCCAGT GCAGATGGTG CGTATTCAGG 

1401 GACTATTGTC TTTTCAGGAG AAACCCTCAC TGCTACCGAA GCAGCAACCC 

1451 CTGCAAATGC TACATCTACA TTAAACCAAA AGCTAGAACT TGAAGGCGGT 

1501 ACTCTCGCTT TAAGAAACGG TGCTACCTTA AATGTTCATA ACTTCACGCA

1551 AGATGAAAAG TCCGTCGTCA TCATGGATGC AGGGACCACA TTAGCAACTA 

1601 CAAATGGAGC TAATAATACT GACGGTGCTA TCACCTTAAA CAAGCTTGTA 

1651 ATCAATCTGG ATTCTTTGGA TGGCACTAAA GCGGCTGTCG TTAATGTGCA 

1701 GAGTACCAAT GGAGCTCTCA CTATATCCGG AACTTTAGGA CTTGTGAAAA 

1751 ACTCTCAAGA TTGCTGTGAC AACCACGGGA TGTTTAATAA AGATTTACAG 

1801 CAAGTTCCGA TTTTAGAACT CAAAGCGACT TCAAATACTG TAACCACTAC 

1851 GGACTTCAGT CTCGGCACAA ACGGCTATCA GCAATCTCCC TATGGGTATC 

1901 AAGGAACTTG GGAGTTTACC ATAGACACGA CAACCCATAC GGTCACAGGA 

1951 AATTGGAAAA AAACCGGTTA TCTTCCTCAT CCGGAGCGTC TTGCTCCCCT

2001 CATTCCTAAT AGCCTATGGG CAAACGTCAT AGATTTACGA GCTGTAAGTC 

2051 AAGCGTCAGC AGCTGATGGC GAAGATGTCC CTGGGAAGCA ACTGAGCATC 

2101 ACAGGAATTA CAAATTTCTT CCATGCGAAT CATACCGGTG ATGCACGCAG 

2151 CTACCGCCAT ATGGGTGGAG GCTACCTCAT CAATACCTAC ACACGCATCA 

2201 CTCCAGATGC TGCGTTAAGT CTAGGTTTTG GACAGCTGTT TACAAAATCT 

2251 AAGGATTACC TCGTAGGTCA CGGTCATTCT AACGTTTATT TCGCTACAGT 

2301 ATACTCTAAC ATCACCAAGT CTCTGTTTGG ATCATCGAGA TTCTTCTCAG 

2351 GAGGCACTTC TCGAGTTACC TATAGCCGTA GCAATGAGAA AGTAAAGACT 

2401 TCATATACAA AATTGCCTAA AGGGCGCTGC TCTTGGAGTA ACAATTGCTG 

2451 GTTAGGAGAA CTCGAAGGGA ACCTTCCCAT CACTCTCTCT TCTCGCATCT 

2501 TAAACCTCAA GCAGATCATT CCCTTTGTAA AAGCTGAAGT TGCTTACGCG 

2551 ACTCATGGGG GCATCCAAGA AAATACCCCC GAGGGGAGGA TTTTTGGACA 

2601 CGGTCATCTA CTCAACGTTG CAGTTCCCGT AGGCGTCCGC TTTGGTAAAA 

2651 ATTCTCATAA TCGACCAGAT TTTTACACTA TAATCGTAGC CTATGCTCCT 

2701 GATGTCTATC GTCACAATCC TGATTGCGAT ACGACATTAC CTATTAATGG 

2751 AGCTACGTGG ACCTCTATAG GGAATAATCT AACCAGAAGT ACTTTGCTAG 

2801 TACAAGCATC CAGCCATACT TCAGTAAATG ATGTTCTAGA GATCTTCGGG 

2851 CACTGTGGAT GTGATATTCG CAGAACCTCC CGTCAATATA CTCTAGATAT 

2901 AGGAAGCAAA TTACGATTTT AA 

The PSORT algorithm predicts an outer membrane location (0.917).

The protein was expressed in E.coli and purified as a his-tag product, as shown in Figure 2A, and
also as a GST-fusion. Both proteins were used to immunise mice, whose sera were used in a Western
blot (Figure 2B) and for FACS analysis (Figure 2C).

The cp6736 protein was also identified in the 2D-PAGE experiment (Cpn0453) and showed good 
cross-reactivity with human sera, including sera from patients with pneumonitis. 

These experiments show that cp6736 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 

Example 3 

The following C.pneumoniae protein (pid 4376751) was expressed <SEQ ID 5; cp6751>: 

1 MRFFCFGMLL PFTFVLANEG LQLPLETYIT LSPEYQAAPQ VGFTHNQNQD

51 LAIVGNHNDF ILDYKYYRSN GGALTCKNLL ISENIGNVFF EKNVCPNSGG

101 AIYAAQNCTI SKNQNYAFTT NLVSDNPTAT AGSLLGGALF AINCSITNNL 

151 GQGTFVDNLA LNKGGALYTE TNLSIKDNKG PIIIKQNRAL NSDSLGGGIY

201 SGNSLNIEGN SGAIQITSNS SGSGGGIFST QTLTISSNKK LIEISENSAF 

251 ANNYGSNFNP GGGGLTTTFC TILNNREGVL FNNNQSQSNG GAIHAKSIII 

301 KENGPVYFLN NTATRGGALL NLSAGSGNGS FILSADNGDI IFNNNTASKH 

351 ALNPPYRNAI HSTPNMNLQI GARPGYRVLF YDPIEHELPS SFPILFNFET 

401 GHTGTVLFSG EHVHQNFTDE MNFFSYLRNT SELRQGVLAV EDGAGLACYK

451 FFQRGGTLLL GQGAVITTAG TIPTPSSTPT TVGSTITLNH IAIDLPSILS

501 FQAQAPKIWI YPTKTGSTYT EDSNPTITIS GTLTLRNSNN EDPYDSLDLS

551 HSLEKVPLLY IVDVAAQKIN SSQLDLSTLN SGEHYGYQGI WSTYWVETTT

601 ITNPTSLLGA NTKHKLLYAN WSPLGYRPHP ERRGEFITNA LWQSAYTALA 

651 GLHSLSSWDE EKGHAASLQG IGLLVHQKDK NGFKGFRSHM TGYSATTEAT 

701 SSQSPNFSLG FAQFFSKAKE HESQNSTSSH HYFSGMCIEN TLFKEWIRLS 

751 VSLAYMFTSE HTHTMYQGLL EGNSQGSFHN HTLAGALSCV FLPQPHGESL

801 QIYPFITALA IRGNLAAFQE SGDHAREFSL HRPLTDVSLP VGIRASWKNH

851 HRVPLVWLTE ISYRSTLYRQ DPELHSKLLI SQGTWTTQAT PVTYNALGIK 

901 VKNTMQVFPK VTLSLDYSAD ISSSTLSHYL NVASRMRF* 

A predicted signal peptide is highlighted. 



The cp6751 nucleotide sequence <SEQ ID 6> is: 



1 ATGCGCTTTT TTTGCTTCGG AATGTTGCTT CCTTTTACTT TTGTATTGGC 

51 TAATGAAGGT CTCCAACTTC CTTTGGAGAC CTATATTACA TTAAGTCCTG 

101 AATATCAAGC AGCCCCTCAA GTAGGGTTTA CTCATAACCA AAATCAAGAT 

151 CTCGCAATTG TCGGGAATCA CAATGATTTC ATCTTGGACT ATAAGTACTA 

201 TCGGTCGAAT GGAGGTGCTC TTACCTGTAA GAATCTTCTG ATCTCTGAAA 

251 ATATAGGGAA TGTCTTCTTT GAGAAGAATG TCTGTCCCAA TTCTGGCGGG 

301 GCAATTTATG CTGCTCAAAA TTGCACGATC TCCAAGAATC AGAACTATGC 

351 ATTTACTACA AACTTGGTCT CTGACAATCC TACAGCCACT GCGGGATCAC 

401 TATTGGGTGG AGCTCTCTTT GCCATAAATT GCTCTATTAC TAATAACCTA 

451 GGACAGGGAA CTTTCGTTGA CAATCTCGCT TTAAATAAGG GGGGTGCCCT 

501 CTATACTGAG ACGAACTTAT CTATTAAAGA CAATAAAGGC CCGATCATAA 

551 TCAAGCAGAA TCGGGCACTA AATTCGGACA GTTTAGGAGG AGGGATTTAT 

601 AGTGGGAACT CTCTAAATAT AGAGGGAAAT TCTGGAGCTA TACAGATCAC 

651 AAGCAACTCT TCAGGATCTG GGGGAGGCAT ATTTTCTACC CAAACACTCA 

701 CGATCTCCTC GAATAAAAAA CTCATAGAAA TCAGTGAAAA TTCCGCGTTC 

751 GCAAATAACT ATGGATCGAA CTTCAATCCA GGAGGAGGAG GTCTTACTAC 

801 CACCTTTTGC ACGATATTGA ACAACCGAGA AGGGGTACTC TTTAACAATA 

851 ACCAAAGCCA GAGCAACGGT GGAGCCATTC ATGCGAAATC TATCATTATC 

901 AAAGAAAATG GTCCTGTATA CTTTTTAAAT AACACTGCAA CTCGGGGAGG 

951 GGCTCTCCTC AACTTATCAG CAGGTTCTGG AAACGGAAGC TTCATCTTAT 

1001 CTGCAGATAA TGGAGATATT ATCTTTAACA ATAATACGGC CTCCAAGCAT 

1051 GCCCTCAATC CTCCATACAG AAACGCCATT CACTCGACTC CTAATATGAA

1101 TCTGCAAATA GGAGCCCGTC CCGGCTATCG AGTGCTGTTC TATGATCCCA 

1151 TAGAACATGA GCTCCCTTCC TCCTTCCCCA TACTCTTTAA TTTCGAAACC 

1201 GGTCATACAG GTACAGTTTT ATTTTCAGGG GAACATGTAC ACCAGAACTT

1251 TACCGATGAA ATGAATTTCT TTTCCTATTT AAGGAACACT TCGGAACTAC

1301 GTCAAGGAGT CCTTGCTGTT GAAGATGGTG CGGGGCTGGC CTGCTATAAG

1351 TTCTTCCAAC GAGGAGGCAC TCTACTTCTA GGTCAAGGTG CGGTGATCAC

1401 GACAGCAGGA ACGATTCCCA CACCATCCTC AACACCAACG ACAGTAGGAA

1451 GTACTATAAC TTTAAATCAC ATTGCCATTG ACCTTCCTTC TATTCTTTCT

1501 TTTCAAGCTC AGGCTCCAAA AATTTGGATT TACCCCACAA AAACAGGATC

1551 TACCTATACT GAAGATTCCA ACCCGACAAT CACAATCTCA GGAACTCTCA

1601 CCTTACGCAA CAGCAACAAC GAAGATCCCT ACGATAGTCT GGATCTCTCG

1651 CACTCTCTTG AGAAAGTTCC CCTTCTTTAT ATTGTCGATG TCGCTGCACA

1701 AAAAATTAAC TCTTCGCAAC TGGATCTATC CACATTAAAT TCTGGCGAAC

1751 ACTATGGGTA TCAAGGCATC TGGTCGACCT ATTGGGTAGA AACTACAACA

1801 ATCACGAACC CTACATCTCT ACTAGGCGCG AATACAAAAC ACAAGCTGCT

1851 CTATGCAAAC TGGTCTCCTC TAGGCTACCG TCCTCATCCC GAACGTCGAG

1901 GAGAATTCAT TACGAATGCC TTGTGGCAAT CGGCATATAC GGCTCTTGCA

1951 GGACTCCACT CCCTCTCCTC CTGGGATGAA GAGAAGGGTC ATGCAGCTTC

2001 CCTACAAGGC ATTGGTCTTC TGGTTCATCA AAAAGACAAA AACGGTTTTA

2051 AGGGATTTCG TAGTCATATG ACAGGTTATA GTGCTACCAC CGAAGCAACC

2101 TCTTCTCAAA GTCCGAATTT CTCTTTAGGA TTTGCTCAGT TCTTCTCCAA

2151 AGCTAAAGAA CATGAATCTC AAAATAGCAC GTCCTCTCAC CACTATTTCT

2201 CTGGAATGTG CATAGAAAAT ACTCTCTTCA AAGAGTGGAT ACGTCTATCT

2251 GTGTCTCTTG CTTATATGTT TACCTCGGAA CATACCCATA CAATGTATCA

2301 GGGTCTCCTG GAAGGGAACT CTCAGGGATC TTTCCACAAC CATACCTTAG

2351 CAGGGGCTCT CTCCTGTGTT TTCTTACCTC AACCTCACGG CGAGTCCCTG

2401 CAGATCTATC CCTTTATTAC TGCCTTAGCC ATCCGAGGAA ATCTTGCTGC

2451 GTTTCAAGAA TCTGGAGACC ATGCTCGGGA ATTTTCCCTA CACCGCCCCC

2501 TAACGGACGT CTCCCTCCCT GTAGGAATCC GCGCTTCTTG GAAGAACCAC

2551 CACCGAGTTC CCCTAGTCTG GCTCACAGAA ATTTCCTATC GCTCTACTCT

2601 CTATAGGCAA GATCCTGAAC TCCACTCGAA ATTACTGATT AGCCAAGGTA

2651 CGTGGACGAC GCAGGCCACT CCTGTGACCT ACAATGCTTT AGGGATCAAA

2701 GTGAAAAATA CCATGCAGGT GTTTCCTAAA GTCACTCTCT CCTTAGATTA

2751 CTCTGCGGAT ATTTCTTCCT CCACGCTGAG TCACTACTTA AACGTGGCGA

2801 GTAGAATGAG ATTTTAA

The PSORT algorithm predicts an outer membrane location (0.923). 

The protein was expressed in E.coli and purified as a GST-fusion product, as shown in Figure 3A,
and also in his-tagged form. The GST-fusion recombinant protein was used to immunise mice, whose
sera were used in a Western blot (Figure 3B) and for FACS analysis (Figure 3C).

This protein also showed good cross-reactivity with human sera, including sera from patients with 
pneumonitis. 

These experiments show that cp6751 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 



Example 4 

The following C.pneumoniae protein (pid 4376752) was expressed <SEQ ID 7; cp6752>: 

1 MFGMTPAVYS LQTDSLEKFA LERDEEFRTS FPLLDSLSTL TGFSPITTFV

51 GNRHNSSQDI VLSNYKSIDN ILLLWTSAGG AVSCNNFLLS NVEDHAFFSK

101 NLAIGTGGAI ACQGACTITK NRGPLIFFSN RGLNNASTGG ETRGGAIACN

151 GDFTISQNQG TFYFVNNSVN NWGGALSTNG KCRIQSNRAP LLFFNNTAPS 

201 GGGALRSENT TISDNTRPIY FKNNCGNNGG AIQTSVTVAI KNNSGSVIFN 

251 NNTALSGSIN SGNGSGGAIY TTNLSIDDNP GTILFNNNYC IRDGGAICTQ 

301 FLTIKNSGHV YFTNNQGNWG GALHLLQDST CLLFAEQGNI AFQNNEVFLT

351 TFGRYNAIHC TPNSNLQLGA NKGYTTAFFD PIEHQHPTTN PLIFNPNANH 

401 QGTILFSSAY IPEASDYENN FISSSKNTSE LRNGVLSIED RAGWQFYKFT

451 QKGGILKLGH AASIATTANS ETPSTSVGSQ VIINNLAINL PSILAKGKAP

501 TLWIRPLQSS APFTEDNNPT ITLSGPLTLL NEENRDPYDS IDLSEPLQNI 

551 HLLSLSDVTA RHINTDNFHP ESLNATEHYG YQGIWSPYWV ETITTTNNAS

601 IETANTLYRA LYANWTPLGY KVNPEYQGDL ATTPLWQSFH TMFSLLRSYN 

651 RTGDSDIERP FLEIQGIADG LFVHQNSIPG APGFRIQSTG YSLQASSETS

701 LHQKISLGFA QFFTRTKEIG SSNNVSAHNT VSSLYVELPW FQEAFATSTV

751 LAYGYGDHHL HSLHPSHQEQ AEGTCYSHTL AAAIGCSFPW QQKSYLHLSP

801 FVQAIAIRSH QTAFEEIGDN PRKFVSQKPF YNLTLPLGIQ GKWQSKFHVP

851 TEWTLELSYQ PVLYQQNPQI GVTLLASGGS WDILGHNYVR NALGYKVHNQ

901 TALFRSLDLF LDYQGSVSSS TSTHHLQAGS TLKF*



The cp6752 nucleotide sequence <SEQ ID 8> is: 



1 


ATGTTCGGGA 


51 


AAAGTTTGCT 


101 


TAGACTCTCT 


151 


GGAAATAGAC 


201 


TATTGATAAC 


251 


GTAATAATTT 


301 


AATCTCGCGA 


351 


AATCACGAAG 


401 


ACAATGCGAG 


451 


GGAGACTTCA 


501 


TTCCGTCAAC 
* a v.* *w w x 


551 


tccaa anr A A 


601 




651 


TCCTATTTAT 


701 


V,rtftOv^U X X 


751 


AACAACAPAR 


801 




851 


x x x x vnn x .r-u-i 


901 


fvprprnm/^ A P A A 

X X X X l\3ft.l„ArL 


951 




1001 




1051 


ACAT'PTGGTA 


1101 


ACTTGGAGCT 


1151 


APCAACATPP 


1201 


CAGGGAAPGA 


1251 


CGAAAATAAT 


1301 


GTGTCCTCTC 


1351 


CAAAAAGGAG 


1401 


TGCCAACTCT 


1451 


ATAACCTTGC 


1501 


ACCTTGTGGA 


1551 


TAACCCTACA 


1601 


ACCGCGATCC 


1651 


C ATCTTC TTT 


1701 


CTTTCATCCT 


1751 


TCTGGTCTCC 


1801 


ATAGAGACGG 


1851 


CTTAGGATAT 


1901 


CCCTATGGCA 


1951 


CGAACTGGTG 


2001 


TGCCGACGGC 


2051 


TCCGTATCCA 


2101 


TTACATCAGA 


2151 


AGAAATCGGA 


2201 


TTTATGTTGA 


2251 


TTAGCGTATG 


2301 


TCAAGAACAG 


2351 


TCGGCTGTTC 


2401 


TTCGTTCAGG 


2451 


TGGTGACAAT 


2501 


CCTTACCTCT 


2551 


ACAGAATGGA 


2601 


TCCCCAAATC 


2651 


TAGGCCATAA 


2701 


ACTGCGCTCT 


2751 


CTCCTCCTCG 


2801 


TCTAA 



TGACTCCTGC 
TTAGAGAGGG 
CTCCACTCTT 
ATAATTCCTC 
ATCCTTCTTC 
CTTATTATCA 
TTGGGACTGG 
AATAGAGGAC 
TACAGGAGGA 
CGATTTCTCA 
AACTGGGGAG 
CAGGGCACCT 
CGCTTCGTAG 
TTTAAGAACA 
TGTTGCGATA 
CGTTATCTGG 
ACAACAAACC 
TAACTACTGC 
TCAAAAATAG 
GGTGCTCTTA 
AGGAAATATC 
GATACAACGC 
AATAAGGGGT 
AACTACAAAT 
TCTTATTTTC 
TTCATTAGCA 
TATCGAGGAT 
GTATCCTTAA 
GAGACTCCAT 
GATTAACCTC 
TCCGTCCTCT 
ATTACTTTAT 
CTACGACAGT 
CTTTATCGGA 
GAAAGCTTAA 
TTATTGGG T A 
CAAACACCCT 
AAGGTCAATC 
ATCCTTTCAT 
ATTCTGATAT 
CTCTTTGTTC 
ATCTACAGGG 
AAATCTCCTT 
TCAAGCAACA 
GCTTCCGTGG 
GCTATGGGGA 
GCAGAAGGGA 
TTTCCCTTGG 
CAATTGCAAT 
CCCCGAAAGT 
AGGAATCCAA 
CTCTAGAACT 
GGTGTC AC GC 
CTATGTTCGC 
TCCGTTCTCT 
ACATCTACGC 



AGTGTATAGT 
ATGAAGAGTT 
ACAGGATTTT 
TCAAGACATT 
TTTGGACATC 
AATGTTGAAG 
AGGCGCGATT 
CCCTTATTTT 
GAAACTCGTG 
AAATCAAGGG 
GAGCCCTCTC 
CTACTC TTTT 
TGAAAATACA 
ACTGTGGGAA 
AAAAATAACT 
TTCGATAAAT 
TATCCATAGA 
ATTCGCGATG 
TGGCCACGTA 
TGCTCCTACA 
GCATTTCAAA 
CATACATTGT 
ATACGACTGC 
CCTCTAATCT 
TTCAGCCTAT 
GC TCGAAAAA 
CGTGCGGGAT 
ATTAGGGCAT 
CAACTAGTGT 
CCCTCGATCT 
ACAATCTAGT 
CAGGTCCTCT 
ATAGATCTCT 
TGTAACAGCA 
ATGCGACTGA 
GAGACGATAA 
CTACAGAGCT 
C TGAATACC A 
ACTATGTTCT 
CGAGAGGCCT 
ATCAAAATAG 
TATTCCTTAC 
AGGTTTTGCA 
AC GT CTCGGC 
TTCCAAGAGG 
CCATCACCTC 
CGTGTTATAG 
CAACAGAAAT 
ACGTTCTCAC 
TTGTCTCTCA 
GGAAAATGGC 
TTCTTACCAA 
TACTTGCGAG 
AATGCTTTAG 
CGATCTATTC 
ACCATCTCCA 



TTACAAACGG 
TCGTACGAGC 
CTCCAATAAC 
GTACTTTCTA 
GGCTGGGGGA 
ACCATGCCTT 
GC TTGCC AGG 
TTTCAGCAAT 
GGGGTGCGAT 
ACTTTCTACT 
CACCAATGGA 
TTAACAATAC 
ACGATCTCTG 
CAATGGCGGG 
CCGGGTCGGT 
TCAGGAAATG 
CGATAACCCT 
GCGGAGCTAT 
TATTTCACCA 
GGACAGCACC 
ATAATGAGGT 
ACACCAAATA 
TTTTTTTGAT 
TTAATCCCAA 
ATCCCAGAAG 
TACCTCTGAA 
GGCAATTCTA 
GCGGCGAGTA 
AGGCTCCCAG 
TAGCAAAAGG 
GCTCCTTTCA 
GACACTCTTA 
CTGAGCCTTT 
CGTCATATCA 
GCATTACGGT 
CAACAACAAA 
CTGTATGCCA 
AGGAGATCTT 
CTCTATTAAG 
TTCTTAGAAA 
CATCCCCGGG 
AAGCATCCTC 
CAGTTCTTCA 
TCACAATACA 
CCTTTGCAAC 
CACAGCCTAC 
CCATACATTA 
C CTATCTTC A 
CAAACAGCGT 
AAAGCCTTTC 
AGTCAAAATT 
CCGGTACTCT 
CGGAGGTTCC 
GGTACAAAGT 
TTGGATTACC 
AGCAGGAAGT 



ACTCCCTTGA 
TTTCCTCTCT 
TACGTTTGTT 
ACTACAAGTC 
GCTGTGTCCT 
CTTCAGTAAA 
GAGCCTGCAC 
CGAGGTCTTA 
TGCCTGTAAT 
TTGTCAACAA 
CACTGCCGCA 
AGCCCCTAGT 
ATAACACGCG 
GC C ATT C AAA 
GATTTTCAAT 
GTTCAGGAGG 
GGAACTATTC 
CTGTACACAA 
ACAATCAAGG 
TGCCTACTCT 
TTTCCTCACC 
GCAACTTACA 
CCTATAGAAC 
TGCGAACCAT 
CTTCTGACTA 
CTTCGCAATG 
TAAGTTC AC T 
TTGCAACAAC 
GTCATCATTA 
AAAAGCTCCT 
CAGAGGACAA 
AATGAGGAAA 
ACAAAACATT 
ATACCGATAA 
TATCAAGGCA 
TAACGCTTCT 
ATTGGACTCC 
GCTACGACTC 
AAGTTATAAT 
TTCAAGGGAT 
GCTCCAGGAT 
CGAAACTTCT 
CCCGCACTAA 
GTCTCTTCAC 
ATCCACAGTG 
ATCCCTCACA 
GCAGCAGCTA 
CCTCAGCCCG 
TCGAAGAGAT 
TATAATCTGA 
CCACGTACCT 
ATCAACAAAA 
TGGGATATCC 
CCACAATCAA 
AAGGATCGGT 
ACCTTAAAAT 



The PSORT algorithm predicts a cytoplasmic location (0.138). 
The protein was expressed in E.coli and purified as a his-tag product, as shown in Figure 4A, and
also as a GST-fusion. The recombinant proteins were used to immunise mice, whose sera were used
in a Western blot (Figure 4B), and the his-tagged protein was used for FACS analysis (Figure 4C).

The cp6752 protein was also identified in the 2D-PAGE experiment (Cpn0467). 

These experiments show that cp6752 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 

Example 5 

The following C.pneumoniae protein (pid 4376850) was expressed <SEQ ID 9; cp6850>: 

1 MKKAVLIAAM FCGVVSLSSC CRIVDCCFED PCAPSSCNPC EVIRKKERSC
51 GGNACGSYVP SCSNPCGSTE CNSQSPQVKG CTSPDGRCKQ * 

A predicted signal peptide is highlighted. 

The cp6850 nucleotide sequence <SEQ ID 10> is: 

1 ATGAAGAAAG CTGTTTTAAT TGCTGCAATG TTTTGTGGAG TAGTTAGCTT 

51 AAGTAGCTGC TGCCGCATTG TAGATTGTTG TTTTGAGGAT CCTTGCGCAC 

101 CCTCTTCTTG CAATCCTTGT GAAGTAATAA GAAAAAAAGA AAGATCTTGC 

151 GGCGGTAATG CTTGTGGGTC CTACGTTCCT TCTTGTTCTA ATCCATGTGG 

201 TTCAACAGAG TGTAACTCTC AAAGCCCACA AGTTAAAGGT TGTACATCAC 

251 CTGATGGCAG ATGCAAACAG TAA 
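
The relationship between the cp6850 nucleotide sequence and the cp6850 amino acid sequence can be checked with a short script. The following Python sketch is an illustration only and is not part of the original disclosure: it assumes the standard genetic code, and the names `CODON_TABLE`, `CP6850_NT` and `translate` are hypothetical. It translates SEQ ID 10 codon by codon so that the result can be compared with SEQ ID 9.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order ('*' marks a stop codon).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

# cp6850 nucleotide sequence (SEQ ID 10), copied from the listing above.
CP6850_NT = """
ATGAAGAAAG CTGTTTTAAT TGCTGCAATG TTTTGTGGAG TAGTTAGCTT
AAGTAGCTGC TGCCGCATTG TAGATTGTTG TTTTGAGGAT CCTTGCGCAC
CCTCTTCTTG CAATCCTTGT GAAGTAATAA GAAAAAAAGA AAGATCTTGC
GGCGGTAATG CTTGTGGGTC CTACGTTCCT TCTTGTTCTA ATCCATGTGG
TTCAACAGAG TGTAACTCTC AAAGCCCACA AGTTAAAGGT TGTACATCAC
CTGATGGCAG ATGCAAACAG TAA
"""


def translate(nucleotides):
    """Translate a nucleotide sequence in reading frame 1, ignoring whitespace."""
    seq = "".join(nucleotides.split()).upper()
    usable = len(seq) - len(seq) % 3  # drop any incomplete trailing codon
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


protein = translate(CP6850_NT)
print(protein)
```

The same helper can be applied to any of the nucleotide sequences in these Examples, provided the listing is legible over its whole length.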

The PSORT algorithm predicts an inner membrane location (0.329). 

The protein was expressed in E.coli and purified as a GST-fusion product, as shown in Figure 5A.
The recombinant protein was used to immunise mice, whose sera were used in a Western blot 
(Figure 5B) and for FACS analysis (Figure 5B). A his-tagged protein was also expressed. 

These experiments show that cp6850 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 

Example 6 

The following C.pneumoniae protein (pid 4376900) was expressed <SEQ ID 11; cp6900>:

1 MKIKFSWKVN FLICLLAVGL IFFGCSRVKR EVLVGRDATW FPKQFGIYTS

51 DTNAFLNDLV SEINYKENLN INIVNQDWVH LFENLDDKKT QGAFTSVLPT

101 LEMLEHYQFS DPILLTGFVL VVAQDSPYQS IEDLKGRLIG VYKFDSSVLV

151 AQNIPDAVIS LYQHVPIALE ALTSNCYDAL LAFVIEVTAL IETAYKGRLK

201 IISKPLNADG LRLAILKGTN GDLLEGFNAG LVKTRRSGKY DAIKQRYRLP

The cp6900 nucleotide sequence <SEQ ID 12> is: 

1 GTGAAGATAA AATTTTCTTG GAAGGTAAAT TTTTTAATAT GTTTACTGGC

51 TGTGGGACTG ATCTTTTTCG GGTGCTCTCG AGTAAAAAGA GAAGTTCTCG 

101 TAGGTCGTGA TGCCACCTGG TTTCCAAAAC AATTCGGCAT TTATACATCC 

151 GATACCAACG CATTTTTAAA CGATCTTGTT TCTGAGATTA ACTATAAAGA 

201 GAATCTAAAT ATTAATATTG TAAATCAAGA TTGGGTGCAT CTCTTTGAGA 

251 ATTTAGATGA TAAAAAGACC CAAGGAGCAT TTACATCTGT ATTGCCTACT 

301 CTTGAGATGC TCGAACACTA TCAATTTTCT GATCCCATTT TACTCACAGG 

351 TCCTGTCCTT GTCGTCGCTC AAGACTCTCC TTACCAATCT ATAGAGGATC 

401 TTAAAGGTCG TCTTATTGGA GTGTATAAGT TTGACTCTTC AGTTCTTGTA 

451 GCTCAAAATA TCCCTGACGC TGTGATTAGC CTCTACCAAC ATGTTCCAAT 

501 AGCATTGGAA GCCTTAACAT CGAATTGTTA CGACGCTCTT CTAGCTCCTG

551 TAATTGAAGT GACCGCGCTA ATAGAAACAG CATATAAAGG AAGACTGAAA

601 ATTATTTCAA AACCCTTAAA CGCAGATGGT TTGCGGCTTG CAATACTGAA

651 AGGGACAAAC GGAGATTTGC TTGAAGGGTT TAACGCAGGA CTTGTGAAAA

701 CACGACGCTC AGGAAAATAC GATGCTATAA AACAGCGGTA TCGTCTTCCC

751 TAA
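
The numbered layout used for the sequences in these Examples (a position label followed by groups of ten residues) lends itself to a mechanical check. The following Python sketch is an illustration only and is not part of the original disclosure; the function name `parse_numbered_block` and the variable `BLOCK` are hypothetical. It joins the rows of a block and verifies that each position label equals the position of the first residue on that row, using the final three rows of the cp6900 nucleotide sequence as sample input.

```python
def parse_numbered_block(text):
    """Join a numbered sequence block into one string.

    Each non-blank row is expected to start with the position of its first
    residue, followed by whitespace-separated residue groups. A ValueError
    is raised if a label does not follow on from the previous row.
    """
    start = None
    expected = None
    rows = []
    for line in text.strip().splitlines():
        fields = line.split()
        if not fields:
            continue  # skip blank lines between rows
        label = int(fields[0])
        row = "".join(fields[1:])
        if start is None:
            start = expected = label
        if label != expected:
            raise ValueError(f"row labelled {label}, expected {expected}")
        rows.append(row)
        expected = label + len(row)
    return start, "".join(rows)


# Final three rows of the cp6900 nucleotide sequence (SEQ ID 12).
BLOCK = """
651 AGGGACAAAC GGAGATTTGC TTGAAGGGTT TAACGCAGGA CTTGTGAAAA
701 CACGACGCTC AGGAAAATAC GATGCTATAA AACAGCGGTA TCGTCTTCCC
751 TAA
"""

start, seq = parse_numbered_block(BLOCK)
print(start, start + len(seq) - 1)  # 651 753
```

A row whose label does not match the running count indicates that residues are missing or duplicated in the listing.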

The PSORT algorithm predicts an inner membrane location (0.452). 

The protein was expressed in E.coli and purified as a GST-fusion product, as shown in Figure 6A. 
The recombinant protein was used to immunise mice, whose sera were used for FACS analysis 
(Figure 6B). A his-tagged protein was also expressed. 

The cp6900 protein was also identified in the 2D-PAGE experiment (Cpn0604). 

These experiments show that cp6900 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 

Example 7 

The following C.pneumoniae protein (pid 4377033) was expressed <SEQ ID 13; cp7033>:

1 MVNPIGPGPI DETERTPPAD LSAQGLEASA ANKSAEAQRI AGAEAKPKES

51 KTDSVERWSI LRSAVNALMS LADKLGIASS NSSSSTSRSA DVDSTTATAP

101 TPPPPTFDDY KTQAQTAYDT IFTSTSLADI QAALVSLQDA VTNIKDTAAT

151 DEETAIAAEW ETKNADAVKV GAQITELAKY ASDNQAILDS LGKLTSFDLL

201 QAALLQSVAN NNKAAELLKE MQDNPVVPGK TPAIAQSLVD QTDATATQIE

251 KDGNAIRDAY FAGQNASGAV ENAKSNNSIS NIDSAKAAIA TAKTQIAEAQ

301 KKFPDSPILQ EAEQMVTQAE KDLKNIKPAD GSDVPNPGTT VGGSKQQGSS

351 IGSIRVSMLL DDAENETASI LMSGFRQMIH MFNTENPDSQ AAQQELAAQA

401 RAAKAAGDDS AAAALADAQK ALEAALGKAG QQQGILNALG QIASAAVVSA

451 GVPPAAASSI GSSVKQLYKT SKSTGSDYKT QISAGYDAYK SINDAYGRAR

501 NDATRDVINN VSTPALTRSV PRARTEARGP EKTDQALARV ISGNSRTLGD

551 VYSQVSALQS VMQIIQSNPQ ANNEEIRQKL TSAVTKPPQF GYPYVQLSND

601 STQKFIAKLE SLFAEGSRTA AEIKALSFET NSLFIQQVLV NIGSLYSGYL

651 Q*

The cp7033 nucleotide sequence <SEQ ID 14> is: 



1 ATGGTTAATC CTATTGGTCC AGGTCCTATA GACGAAACAG AACGCACACC 

51 TCCCGCAGAT CTTTCTGCTC AAGGATTGGA GGCGAGTGCA GCAAATAAGA 

101 GTGCGGAAGC TCAAAGAATA GCAGGTGCGG AAGCTAAGCC TAAAGAATCT 

151 AAGACCGATT CTGTAGAGCG ATGGAGCATC TTGCGTTCTG CAGTGAATGC 

201 TCTCATGAGT CTGGCAGATA AGCTGGGTAT TGCTTCTAGT AACAGCTCGT 

251 CTTCTACTAG CAGATCTGCA GACGTGGACT CAACGACAGC GACCGCACCT 

301 ACGCCTCCTC CACCCACGTT TGATGATTAT AAGACTCAAG CGCAAACAGC 

351 TTACGATACT ATCTTTACCT CAACATCACT AGCTGACATA CAGGCTGCTT 

401 TGGTGAGCCT CCAGGATGCT GTCACTAATA TAAAGGATAC AGCGGCTACT 

451 GATGAGGAAA CCGCAATCGC TGCGGAGTGG GAAACTAAGA ATGCCGATGC 

501 AGTTAAAGTT GGCGCGCAAA TTACAGAATT AGCGAAATAT GCTTCGGATA

551 ACCAAGCGAT TCTTGACTCT TTAGGTAAAC TGACTTCCTT CGACCTCTTA 

601 CAGGCTGCTC TTCTCCAATC TGTAGCAAAC AATAACAAAG CAGCTGAGCT 

651 TCTTAAAGAG ATGCAAGATA ACCCAGTAGT CCCAGGGAAA ACGCCTGCAA 

701 TTGCTCAATC TTTAGTTGAT CAGACAGATG CTACAGCGAC ACAGATAGAG 

751 AAAGATGGAA ATGCGATTAG GGATGCATAT TTTGCAGGAC AGAACGCTAG 

801 TGGAGCTGTA GAAAATGCTA AATCTAATAA CAGTATAAGC AACATAGATT 

851 CAGCTAAAGC AGCAATCGCT ACTGCTAAGA CACAAATAGC TGAAGCTCAG 

901 AAAAAGTTCC CCGACTCTCC AATTCTTCAA GAAGCGGAAC AAATGGTAAT 

951 ACAGGCTGAG AAAGATCTTA AAAATATCAA ACCTGCAGAT GGTTCTGATG 

1001 TTCCAAATCC AGGAACTACA GTTGGAGGCT CCAAGCAACA AGGAAGTAGT 

1051 ATTGGTAGTA TTCGTGTTTC CATGCTGTTA GATGATGCTG AAAATGAGAC 

1101 CGCTTCCATT TTGATGTCTG GGTTTCGTCA GATGATTCAC ATGTTCAATA 

1151 CGGAAAATCC TGATTCTCAA GCTGCCCAAC AGGAGCTCGC AGCACAAGCT 

1201 AGAGCAGCGA AAGCCGCTGG AGATGACAGT GCTGCTGCAG CGCTGGCAGA

1251 TGCTCAGAAA GCTTTAGAAG CGGCTCTAGG TAAAGCTGGG CAACAACAGG 

1301 GCATACTCAA TGCTTTAGGA CAGATCGCTT CTGCTGCTGT TGTGAGCGCA 

1351 GGAGTTCCTC CCGCTGCAGC AAGTTCTATA GGGTCATCTG TAAAACAGCT 

1401 TTACAAGACC TCAAAATCTA CAGGTTCTGA TTATAAAACA CAGATATCAG 
1451 CAGGTTATGA TGCTTACAAA TCCATCAATG ATGCCTATGG TAGGGCACGA
1501 AATGATGCGA CTCGTGATGT GATAAACAAT GTAAGTACCC CCGCTCTCAC
1551 ACGATCCGTT CCTAGAGCAC GAACAGAAGC TCGAGGACCA GAAAAAACAG
1601 ATCAAGCCCT CGCTAGGGTG ATTTCTGGCA ATAGCAGAAC TCTTGGAGAT
1651 GTCTATAGTC AAGTTTCGGC ACTACAATCT GTAATGCAGA TCATCCAGTC
1701 GAATCCTCAA GCGAATAATG AGGAGATCAG ACAAAAGCTT ACATCGGCAG
1751 TGACAAAGCC TCCACAGTTT GGCTATCCTT ATGTGCAACT TTCTAATGAC
1801 TCTACACAGA AGTTCATAGC TAAATTAGAA AGTTTGTTTG CTGAAGGATC
1851 TAGGACAGCA GCTGAAATAA AAGCACTTTC CTTTGAAACG AACTCCTTGT
1901 TTATTCAGCA GGTGCTGGTC AATATCGGCT CTCTATATTC TGGTTATCTC
1951 CAATAA

The PSORT algorithm predicts a cytoplasmic location (0.272). 

The protein was expressed in E.coli and purified as a GST-fusion product, as shown in Figure 7A. A
his-tagged protein was also expressed. The recombinant proteins were used to immunise mice, whose
sera were used for FACS (Figure 7B) and Western blot (Figure 7C) analyses.

The cp7033 protein was also identified in the 2D-PAGE experiment (Cpn0728) and showed good 
cross-reactivity with human sera, including sera from patients with pneumonitis. 

These experiments show that cp7033 is a surface-exposed and immunoaccessible protein, and that it is
a useful immunogen. These properties are not evident from the sequence alone.

Example 8 

The following C.pneumoniae protein (pid 6172321) was expressed <SEQ ID 15; cp0017>: 

1 MGIKGTGIIV WVDDATAKTK NATLTWTKTG YKPNPERQGP LVPNSLWGSF 

51 VDVRSIQSLM DRSTSSLSSS TNLWVSGIAD FLHEDQKGNQ RSYRHSSAGY 

101 ALGGGFFTAS ENFFNFAFCQ LFGYDKDHLV AKNHTHVYAG AMSYRHLGES 

151 KTLAKILSGN SDSLPFVFNA RFAYGHTDNN MTTKYTGYSP VKGSWGNDAF

201 GIECGGAIPV VASGRRSWVD THTPFLNLEM IYAHQNDFKE NGTEGRSFQS

251 EDLFNLAVPV GIKFEKFSDK STYDLSIAYV PDVIRNDPGC TTTLMVSGDS

301 WSTCGTSLSR QALLVRAGNH HAFASNFEVF SQFEVELRGS SRSYAIDLGG

351 RFGF* 

The cp0017 nucleotide sequence <SEQ ID 16> is: 

1 ATGGGTATCA AGGGAACTGG AATAATTGTT TGGGTCGACG ATGCAACTGC 

51 AAAAACAAAA AATGCTACCT TAACTTGGAC TAAAACAGGA TACAAGCCGA

101 ATCCAGAACG TCAGGGACCT TTGGTTCCTA ATAGCCTGTG GGGTTCTTTT 

151 GTCGATGTCC GCTCCATTCA GAGCCTCATG GACCGGAGCA CAAGTTCGTT 

201 ATCTTCGTCA ACAAATTTGT GGGTATCAGG AATCGCGGAC TTTTTGCATG 

251 AAGATCAGAA AGGAAACCAA CGTAGTTATC GTCATTCTAG CGCGGGTTAT

301 GCATTAGGAG GAGGATTCTT CACGGCTTCT GAAAATTTCT TTAATTTTGC 

351 TTTTTGTCAG CTTTTTGGCT ACGACAAGGA CCATCTTGTG GCTAAGAACC

401 ATACCCATGT ATATGCAGGG GCAATGAGTT ACCGACACCT CGGAGAGTCT 

451 AAGACCCTCG CTAAGATTTT GTCAGGAAAT TCTGACTCCC TACCTTTTGT 

501 CTTCAATGCT CGGTTTGCTT ATGGCCATAC CGACAATAAC ATGACCACAA

551 AGTACACTGG CTATTCTCCT GTTAAGGGAA GCTGGGGAAA TGATGCCTTC 

601 GGTATAGAAT GTGGAGGAGC TATCCCGGTA GTTGCTTCAG GACGTCGGTC 

651 TTGGGTGGAT ACCCACACGC CATTTCTAAA CCTAGAGATG ATCTATGCAC 

701 ATCAGAATGA CTTTAAGGAA AACGGCACAG AAGGCCGTTC TTTCCAAAGT

751 GAAGACCTCT TCAATCTAGC GGTTCCTGTA GGGATAAAAT TTGAGAAATT 

801 CTCCGATAAG TCTACGTATG ATCTCTCCAT AGCTTACGTT CCCGATGTGA 

851 TTCGTAATGA TCCAGGCTGC ACGACAACTC TTATGGTTTC TGGGGATTCT 

901 TGGTCGACAT GTGGTACAAG CTTGTCTAGA CAAGCTCTTC TTGTACGTGC 

951 TGGAAATCAT CATGCCTTTG CTTCAAACTT TGAAGTTTTC AGTCAGTTTG 

1001 AAGTCGAGTT GCGAGGTTCT TCTCGTAGCT ATGCTATCGA TCTTGGAGGA 

1051 AGATTCGGAT TTTAA 

This sequence is frame-shifted with respect to cp0016. 
The PSORT algorithm predicts a cytoplasmic location (0.075).

The protein was expressed in E.coli and purified as a GST-fusion product, as shown in Figure 8A.
The recombinant protein was used to immunise mice, whose sera were used in a Western blot
(Figure 8B) and for FACS analysis (Figure 8C). A his-tagged protein was also expressed.

This protein also showed good cross-reactivity with human sera, including sera from patients with
pneumonitis. 

These experiments show that cp0017 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 

Example 9 

The following C.pneumoniae protein (pid 6172315) was expressed <SEQ ID 17; cp0014>:

1 MKSSFPKFVF STFAIFPLSM IATETVLDSS ASFDGNKNGN FSVRESQEDA

51 GTTYLFKGNV TLENIPGTGT AITKSCFNNT KGDLTFTGNG NSLLFQTVDA

101 GTVAGAAVNS SVVDKSTTFI GFSSLSFIAS PGSSITTGKG AVSCSTGSLS

151 LTKMSVCSSA KTFQRIMAVL SPQKLFH*

The cp0014 nucleotide sequence <SEQ ID 18> is:

1 ATGAAGTCTT CTTTCCCCAA GTTTGTATTT TCTACATTTG CTATTTTCCC 

51 TTTGTCTATG ATTGCTACCG AGACAGTTTT GGATTCAAGT GCGAGTTTCG 

101 ATGGGAATAA AAATGGTAAT TTTTCAGTTC GTGAGAGTCA GGAAGATGCT 

151 GGAACTACCT ACCTATTTAA GGGAAATGTC ACTCTAGAAA ATATTCCTGG 

201 AACAGGCACA GCAATCACAA AAAGCTGTTT TAACAACACT AAGGGCGATT

251 TGACTTTCAC AGGTAACGGG AACTCTCTAT TGTTCCAAAC GGTGGATGCA

301 GGGACTGTAG CAGGGGCTGC TGTTAACAGC AGCGTGGTAG ATAAATCTAC

351 CACGTTTATA GGGTTTTCTT CGCTATCTTT TATTGCGTCT CCTGGAAGTT

401 CGATAACTAC CGGCAAAGGA GCCGTTAGCT GCTCTACGGG TAGCTTGAGT

451 TTGACAAAAA TGTCAGTTTG CTCTTCAGCA AAAACTTTTC AACGGATAAT

501 GGCGGTGCTA TCACCGCAAA AACTCTTTCA TTAA 

This protein is frame-shifted with respect to cp0015. 

The PSORT algorithm predicts an inner membrane location (0.047). 

The protein was expressed in E.coli and purified as a his-tag product, as shown in Figure 9A. A
GST-fusion was also expressed. The recombinant proteins were used to immunise mice, whose sera
were used in an immunoassay (Figure 9B) and for FACS analysis (Figure 9C).

This protein also showed good cross-reactivity with human sera, including sera from patients with 
pneumonitis. 

These experiments suggest that cp0014 is a useful immunogen. These properties are not evident from
the sequence alone.

Example 10 

The following C.pneumoniae protein (pid 6172317) was expressed <SEQ ID 19; cp0015>: 

1 MSALFSENTS SKKGGAIQTS DALTITGNQG EVSFSDNTSS DSGAAIFTEA

51 SVTISNNAKV SFIDNKVTGA SSSTTGDMSG GAICAYKTST DTKVTLTGNQ

101 MLLFSNNTST TAGGAIYVKK LELASGGLTL FSRNSVNGGT APKGGAIAIE

151 DSGELSLSAD SGDIVFDGNT VTSTTPGTNR SSIDLGTSAK MTALRSAAGR

201 AIYFYDPITT GSSTTVTDVL KVNETPADSA LQYTGNIIFT GEKLSETEAA

251 DSKNLTSKLL QPVTLSGGTL SLKHGVTLQT QAFTQQADSR LEMDVGTTLE

301 PADTSTINNL VINISSIDGA KKAKIETKAT SKNLTLSGTI TLLDPTGTFY

351 ENHSLRNPQS YDILELKASG TVTSTAVTPD PIMGEKFHYG YQGTWGPIVW

401 GTGASTTATF NWTKTGYIPN PERIGSLVPN SLWNAFIDIS SLHYLMETAN

451 EGLQGDRAFW CAGLSNFFHK DSTKTRRGFR HLSGGYVIGG NLHTCSDKIL

501 SAAFCQLFGR DRDYFVAKNQ GTVYGGTLYY QHNETYISLP CKLRPCSLSY

551 VPTEIPVLFS GNLSYTHTDN DLKTKYTTYP TVKGSWGNDS FALEFGGRAP

601 ICLDESALFE QYMPFMKLQF VYAHQEGFKE QGTEAREFGS SRLVNLALPI

651 GIRFDKESDC QDATYNLTLG YTVDLVRSNP DCTTTLRISG DSWKTFGTNL

701 ARQALVLRAG NHFCFNSNFE AFSQFSFELR GSSRNYNVDL GAKYQF*

This sequence is frame-shifted with respect to cp0014. 
The cp0015 nucleotide sequence <SEQ ID 20> is: 

1 ATGTCAGCTC TGTTTTCTGA AAATACCTCC TCAAAGAAAG GCGGAGCCAT

51 TCAGACTTCC GATGCCCTTA CCATTACTGG AAACCAAGGG GAAGTCTCTT 

101 TTTCTGACAA TACTTCTTCG GATTCTGGAG CTGCAATTTT TACAGAAGCC 

151 TCGGTGACTA TTTCTAATAA TGCTAAAGTT TCCTTTATTG ACAATAAGGT 

201 CACAGGAGCG AGCTCCTCAA CAACGGGGGA TATGTCAGGA GGTGCTATCT 

251 GTGCTTATAA AACTAGTACA GATACTAAGG TCACCCTCAC TGGAAATCAG 

301 ATGTTACTCT TCAGCAACAA TACATCGACA ACAGCGGGAG GAGCTATCTA 

351 TGTGAAAAAG CTCGAACTGG CTTCCGGAGG ACTTACCCTA TTCAGTAGAA 

401 ATAGTGTCAA TGGAGGTACA GCTCCTAAAG GTGGAGCCAT AGCTATCGAA 

451 GATAGTGGGG AATTGAGTTT ATCCGCCGAT AGTGGTGACA TTGTCTTTTT 

501 AGGGAATACA GTCACTTCTA CTACTCCTGG GACGAATAGA AGTAGTATCG 

551 ACTTAGGAAC GAGTGCAAAG ATGACAGCTT TGCGTTCTGC TGCTGGTAGA 

601 GCCATCTACT TCTATGATCC CATAACTACA GGATCATCCA CAACAGTTAC 

651 AGATGTCTTA AAAGTTAATG AGACTCCGGC AGATTCTGCA CTACAATATA 

701 CAGGGAACAT CATCTTCACA GGAGAAAAGT TATCAGAGAC AGAGGCCGCA 

751 GATTCTAAAA ATCTTACTTC GAAGCTACTA CAGCCTGTAA CTCTTTCAGG 

801 AGGTACTCTA TCTTTAAAAC ATGGAGTGAC TCTGCAGACT CAGGCATTCA 

851 CTCAACAGGC AGATTCTCGT CTCGAAATGG ACGTAGGAAC TACTCTAGAA 

901 CCTGCTGATA CTAGCACCAT AAACAATTTG GTCATTAACA TCAGTTCTAT 

951 AGACGGTGCA AAGAAGGCAA AAATAGAAAC CAAAGCTACG TCAAAAAATC 

1001 TGACTTTATC TGGAACCATC ACTTTATTGG ACCCGACGGG CACGTTTTAT 

1051 GAAAATCATA GTTTAAGAAA TCCTCAGTCC TACGACATCT TAGAGCTCAA 

1101 AGCTTCTGGA ACTGTAACAA GCACCGCAGT GACTCCAGAT CCTATAATGG 

1151 GTGAGAAATT CCATTACGGC TATCAGGGAA CTTGGGGCCC AATTGTTTGG 

1201 GGGACAGGGG CTTCTACGAC TGCAACCTTC AACTGGACTA AAACTGGCTA 

1251 TATTCCTAAT CCCGAGCGTA TCGGCTCTTT AGTCCCTAAT AGCTTATGGA

1301 ATGCATTTAT AGATATTAGC TCTCTCCATT ATCTTATGGA GACTGCAAAC 

1351 GAAGGGTTGC AGGGAGACCG TGCTTTTTGG TGTGCTGGAT TATCTAACTT 

1401 CTTCCATAAG GATAGTACAA AAACACGACG CGGGTTTCGC CATTTGAGTG 

1451 GCGGTTATGT CATAGGAGGA AACCTACATA CTTGTTCAGA TAAGATTCTT 

1501 AGTGCTGCAT TTTGTCAGCT CTTTGGAAGA GATAGAGACT ACTTTGTAGC 

1551 TAAGAATCAA GGTACAGTCT ACGGAGGAAC TCTCTATTAC CAGCACAACG 

1601 AAACCTATAT CTCTCTTCCT TGCAAACTAC GGCCTTGTTC GTTGTCTTAT 

1651 GTTCCTACAG AGATTCCTGT TCTCTTTTCA GGAAACCTTA GCTACACCCA

1701 TACGGATAAC GATCTGAAAA CCAAGTATAC AACATATCCT ACTGTTAAAG 

1751 GAAGCTGGGG GAATGATAGT TTCGCTTTAG AATTCGGTGG AAGAGCTCCG 

1801 ATTTGCTTAG ATGAAAGTGC TCTATTTGAG CAGTACATGC CCTTCATGAA 

1851 ATTGCAGTTT GTCTATGCAC ATCAGGAAGG TTTTAAAGAA CAGGGAACAG 

1901 AAGCTCGTGA ATTTGGAAGT AGCCGTCTTG TGAATCTTGC CTTACCTATC 

1951 GGGATCCGAT TTGATAAGGA ATCAGACTGC CAAGATGCAA CGTACAATCT 

2001 AACTCTTGGT TATACTGTGG ATCTTGTTCG TAGTAACCCC GACTGTACGA 

2051 CAACACTGCG AATTAGCGGT GATTCTTGGA AAACCTTCGG TACGAATTTG 

2101 GCAAGACAAG CTTTAGTCCT TCGTGCAGGG AACCATTTTT GCTTTAACTC 

2151 AAATTTTGAA GCCTTTAGCC AATTTTCTTT TGAATTGCGT GGGTCATCTC

2201 GCAATTACAA TGTAGACTTA GGAGCAAAAT ACCAATTCTA A 

The PSORT algorithm predicts a cytoplasmic location (0.274). 

The protein was expressed in E.coli and purified as a GST-fusion product, as shown in Figure 10A. 
The recombinant protein was used to immunise mice, whose sera were used in a Western blot 
(Figure 10B) and for FACS analysis. A his-tagged protein was also expressed. 
These experiments show that cp0015 is a useful immunogen. These properties are not evident from
the sequence alone. 

Example 11 

The following C.pneumoniae protein (pid 6172325) was expressed <SEQ ID 21; cp0019>:

1 LQDSQDYSFV KLSPGAGGTI ITQDASQKPL EVAPSRPHYG YQGHWNVQVI

51 PGTGTQPSQA NLEWVRTGYL PNPERQGSLV PNSLWGSFVD QRAIQEIMVN

101 SSQILCQERG VWGAGIANFL HRDKINEHGY RHSGVGYLVG VGTHAFSDAT

151 INAAFCQLFS RDKDYVVSKN HGTSYSGVVF LEDTLEFRSP QGFYTDSSSE

201 ACCNQVVTID MQLSYSHRNN DMKTKYTTYP EAQGSWANDV FGLEFGATTY

251 YYPNSTFLFD YYSPFLRLQC TYAHQEDFKE TGGEVRHFTS GDLFNLAVPI

301 GVKFERFSDC KRGSYELTLA YVPDVIRKDP KSTATLASGA TWSTHGNNLS

351 RQGLQLRLGN HCLINPGIEV FSHGAIELRG SSRNYNINLG GKYRF* 

This sequence is frame-shifted with respect to cp0018. 
The cp0019 nucleotide sequence <SEQ ID 22> is: 

1 TTGCAAGACT CTCAAGACTA TAGCTTTGTA AAGTTATCTC CAGGAGCGGG 

51 AGGGACTATA ATTACTCAAG ATGCTTCTCA GAAGCCTCTT GAAGTAGCTC 

101 CTTCTAGACC ACATTATGGC TATCAAGGAC ATTGGAATGT GCAAGTCATC 

151 CCAGGAACGG GAACTCAACC GAGCCAGGCA AATTTAGAAT GGGTGCGGAC 

201 AGGATACCTT CCGAATCCCG AACGGCAAGG ATCTTTAGTT CCCAATAGCC 

251 TGTGGGGTTC TTTTGTTGAT CAGCGTGCTA TCCAAGAAAT CATGGTAAAT 

301 AGTAGCCAAA TCTTATGTCA GGAACGGGGA GTCTGGGGAG CTGGAATTGC 

351 TAATTTCCTA CATAGAGATA AAATTAATGA GCACGGCTAT CGCCATAGCG 

401 GTGTCGGTTA TCTTGTGGGA GTTGGCACTC ATGCTTTTTC TGATGCTACG 

451 ATAAATGCGG CTTTTTGCCA GCTCTTCAGT AGAGATAAAG ACTACGTAGT

501 ATCCAAAAAT CATGGAACTA GCTACTCAGG GGTCGTATTT CTTGAGGATA 

551 CCCTAGAGTT TAGAAGTCCA CAGGGATTCT ATACTGATAG CTCCTCAGAA 

601 GCTTGCTGTA ACCAAGTCGT CACTATAGAT ATGCAGTTGT CTTACAGCCA 

651 TAGAAATAAT GATATGAAAA CCAAATACAC GACATATCCA GAAGCTCAGG 

701 GATCTTGGGC AAATGATGTT TTTGGTCTTG AGTTTGGAGC GACTACATAC 

751 TACTACCCTA ACAGTACTTT TTTATTTGAT TACTACTCTC CGTTTCTCAG 

801 GCTGCAGTGC ACCTATGCTC ACCAGGAAGA CTTCAAAGAG ACAGGAGGTG 

851 AGGTTCGTCA CTTTACTAGC GGAGATCTTT TCAATTTAGC AGTTCCTATT 

901 GGCGTGAAGT TTGAGAGATT TTCAGACTGT AAAAGGGGAT CTTATGAACT 

951 TACCCTTGCT TATGTTCCTG ATGTGATTCG CAAAGATCCC AAGAGCACGG 

1001 CAACATTGGC TAGTGGAGCT ACGTGGAGCA CCCACGGAAA CAATCTCTCC 

1051 AGACAAGGAT TACAACTGCG TTTAGGGAAC CACTGTCTCA TAAATCCTGG

1101 AATTGAGGTG TTCAGTCACG GAGCTATTGA ATTGCGGGGA TCCTCTCGTA 

1151 ATTATAACAT CAATCTCGGG GGTAAATACC GATTTTAA 

The PSORT algorithm predicts a cytoplasmic location (0.189). 

The protein was expressed in E.coli and purified as a GST-fusion product, as shown in Figure 11A.
This protein was used to immunise mice, whose sera were used in a Western blot (Figure 11B) and
an immunoblot assay (Figure 11C). A his-tagged protein was also expressed.

These experiments show that cp0019 is a useful immunogen. These properties are not evident from 
the sequence alone. 



Example 12 

The following C.pneumoniae protein (pid 4376466) was expressed <SEQ ID 23; cp6466>:

1 MRKISVGICI TILLSLSVVL QGCKESSHSS TSRGELAINI RDEPRSLDPR
51 QVRLLSEISL VKHIYEGLVQ ENNLSGNIEP ALAEDYSLSS DGLTYTFKLK
101 SAFWSNGDPL TAEDFIESWK QVATQEVSGI YAFALNPIKN VRKIQEGHLS
151 IDHFGVHSPN ESTLVVTLES PTSHFLKLLA LPVFFPVHKS QRTLQSKSLP
201 IASGAFYPKN IKQKQWIKLS KNPHYYNQSQ VETKTITIHF IPDANTAAKL
251 FNQGKLNWQG PPWGERIPQE TLSNLQSKGH LHSFDVAGTS WLTFNINKFP
301 LNNMKLREAL ASALDKEALV STIFLGRAKT ADHLLPTNIH SYPEHQKQEM
351 AQRQAYAKKL FKEALEELQI TAKDLEHLNL IFPVSSSASS LLVQLIREQW
401 KESLGFAIPI VGKEFALLQA DLSSGNFSLA TGGWFADFAD PMAFLTIFAY
451 PSGVPPYAIN HKDFLEILQN IEQEQDHQKR SELVSQASLY LETFHIIEPI
501 YHDAFQFAMN KKLSNLGVSP TGVVDFRYAK EN*



A predicted signal peptide is highlighted. 



The cp6466 nucleotide sequence <SEQ ID 24> is: 



1 ATGCGCAAGA TATCAGTGGG AATCTGTATC ACCATTCTCC TTAGCCTCTC
51 CGTAGTCCTC CAAGGCTGCA AGGAGTCCAG TCACTCCTCT ACATCTCGGG
101 GAGAACTCGC TATTAATATA AGAGATGAAC CCCGTTCTTT AGATCCAAGA
151 CAAGTGCGAC TTCTTTCAGA AATCAGCCTT GTCAAACATA TCTATGAGGG
201 ATTAGTTCAA GAAAATAATC TTTCAGGAAA TATAGAGCCT GCTCTTGCAG
251 AAGACTACTC TCTTTCCTCG GACGGACTCA CTTATACTTT TAAACTGAAA
301 TCAGCTTTTT GGAGTAATGG CGACCCCTTA ACAGCTGAAG ACTTTATAGA
351 ATCTTGGAAA CAAGTAGCTA CTCAAGAAGT CTCAGGAATC TATGCTTTTG
401 CCTTGAATCC AATTAAAAAT GTACGAAAGA TCCAAGAGGG ACACCTCTCC
451 ATAGACCATT TTGGAGTGCA CTCTCCTAAT GAATCTACAC TTGTTGTTAC
501 CCTGGAATCC CCAACCTCGC ATTTCTTAAA ACTTTTAGCT CTTCCAGTCT
551 TTTTCCCCGT TCATAAATCT CAAAGAACCC TGCAATCCAA ATCTCTACCT
601 ATAGCAAGCG GAGCTTTCTA TCCTAAAAAT ATCAAACAAA AACAATGGAT
651 AAAACTCTCA AAAAACCCTC ACTACTATAA TCAAAGTCAG GTGGAAACTA
701 AAACGATTAC GATTCACTTC ATTCCCGATG CAAACACAGC AGCAAAACTA
751 TTTAATCAGG GAAAACTCAA TTGGCAAGGA CCTCCTTGGG GAGAACGCAT
801 TCCTCAAGAA ACCCTATCCA ATTTACAGTC TAAGGGGCAC TTACACTCTT
851 TTGATGTCGC AGGAACCTCA TGGCTCACCT TCAATATCAA TAAATTCCCC
901 CTCAACAATA TGAAGCTTAG AGAAGCCTTA GCATCAGCCT TAGATAAGGA
951 AGCTCTTGTC TCAACTATAT TCTTAGGCCG TGCAAAAACT GCCGATCATC
1001 TCCTACCTAC AAATATTCAT AGCTATCCCG AACATCAAAA ACAAGAGATG
1051 GCACAACGCC AAGCTTACGC TAAAAAACTC TTTAAAGAAG CTTTAGAAGA
1101 ACTCCAAATC ACTGCTAAAG ATCTCGAACA TCTTAATCTT ATCTTTCCCG
1151 TTTCCTCGTC AGCAAGTTCT TTACTAGTCC AACTTATACG AGAACAGTGG
1201 AAAGAAAGTT TAGGGTTCGC TATCCCTATT GTCGGAAAGG AATTTGCTCT
1251 TCTCCAAGCA GACCTATCTT CAGGGAACTT CTCTTTAGCT ACAGGAGGAT
1301 GGTTCGCAGA CTTTGCTGAT CCTATGGCAT TTCTAACGAT CTTTGCTTAT
1351 CCATCAGGAG TTCCTCCTTA TGCAATCAAC CATAAGGACT TCCTAGAAAT
1401 TCTACAAAAC ATAGAACAAG AGCAAGATCA CCAAAAACGC TCGGAATTAG
1451 TGTCGCAAGC TTCTCTTTAC CTAGAGACCT TTCATATTAT TGAGCCGATC
1501 TACCACGACG CATTTCAATT TGCTATGAAT AAAAAACTTT CTAATCTAGG
1551 AGTCTCACCA ACAGGAGTTG TGGACTTCCG TTATGCTAAG GAAAATTAG



The PSORT algorithm predicts that the protein is an outer membrane lipoprotein (0.790). 

The protein was expressed in E.coli and purified both as a GST-fusion product and a His-tag fusion 
product. Purification of the protein as a GST-fusion product is shown in Figure 12A. The 
recombinant proteins were used to immunise mice, whose sera were used in Western blots (Figures 
12B and 12C). FACS analysis was also performed. 

These experiments show that cp6466 is a useful immunogen. These properties are not evident from 
the sequence alone. 



Example 13 

The following C.pneumoniae protein (pid 4376468) was expressed <SEQ ID 25; cp6468>: 

1 MFSKWITLFL LFISLTGCSS YSSKHKQSLI IPIHDDPVAF SPEQAKRAMD

51 LSIAQLLFDG LTRETHRESN DLELAIASRY TVSEDFCSYT FFIKDSALWS

101 DGTPITSEDI RNAWEYAQEN SPHIQIFQGL NFSTPSSNAI TIHLDSPNPD

151 FPKLLAFPAF AIFKPENPKL FSGPYTLVEY FPGHNIHLKK NPNYYPYHCV

201 SINSIKLLII PDIYTAIHLL NRGKVDWVGQ PWHQGIPWEL HKQSQYHYYT

251 YPVEGAFWLC LNTKSPHLND LQNRHRLATC IDKRSIIEEA LQGTQQPAET
301 LSRGAPQPNQ YKKQKPLTPQ EKLVLTYPSD ILRCQRIAEI LKEQWKAAGI
351 DLILEGLEYH LFVNKRKVQD YAIATQTGVA YYPGANLISE EDKLLQNFEI
401 IPIYYLSYDY LTQDFIEGVI YNASGAVDLK YTYFP*



A predicted signal peptide is highlighted. 

The cp6468 nucleotide sequence <SEQ ID 26> is: 



1 [illegible] GATGGATCAC CCTCTTTTTA TTATTCATTA GCCTTACTGG
51 [illegible] TACTCTTCAA AACATAAACA ATCTTTAATT ATTCCCATAC
101 [illegible] TGTAGCTTTT TCTCCTGAAC AAGCAAAACG GGCCATGGAC
151 [illegible] CCCAACTTCT TTTTGATGGT CTGACTAGAG AAACTCATCG
201 [illegible] GATTTGGAAT TAGCGATTGC CAGTCGCTAT ACAGTCTCTG
251 [illegible] CTCTTATACG TTCTTTATCA AAGACAGCGC TTTATGGAGC
301 [illegible] CAATCACCTC CGAAGATATC CGTAACGCTT GGGAGTATGC
351 [illegible] TCTCCCCACA TACAGATCTT CCAAGGACTT AACTTCTCAA
401 CTCCTTCATC AAATGCAATT ACGATTCATC TCGACTCGCC CAACCCCGAT
451 TTTCCTAAGC TTCTTGCCTT TCCTGCATTT GCTATCTTTA AACCAGAAAA
501 CCCGAAGCTC TTTAGCGGTC CGTATACTCT TGTAGAGTAT TTCCCAGGGC
551 ATAACATTCA TTTAAAGAAA AACCCTAACT ATTACGACTA CCACTGCGTC
601 TCCATCAACT CCATCAAACT GCTCATTATT CCTGATATAT ATACAGCCAT
651 CCACCTCCTA AACAGAGGCA AGGTGGACTG GGTAGGACAA CCCTGGCATC
701 AAGGGATTCC TTGGGAGCTC CATAAACAAT CGCAATATCA CTACTACACC
751 TATCCTGTAG AAGGTGCCTT CTGGCTTTGT CTAAATACAA AATCCCCACA
801 CTTAAATGAT CTTCAAAACA GACATAGACT CGCTACTTGT ATTGATAAAC
851 GTTCTATCAT TGAAGAAGCT CTTCAAGGAA CCCAACAACC AGCGGAAACA
901 CTGTCCCGAG GAGCTCCACA ACCAAATCAA TATAAAAAAC AAAAGCCTCT
951 AACTCCACAA GAAAAACTCG TGCTTACCTA TCCCTCAGAT ATTCTAAGAT
1001 GCCAACGCAT AGCAGAAATC TTAAAGGAAC AATGGAAAGC TGCTGGAATA
1051 GATTTAATCC TTGAAGGACT CGAATACCAT CTGTTTGTTA ACAAACGAAA
1101 AGTCCAAGAC TACGCCATAG CAACACAGAC TGGAGTTGCT TATTACCCAG
1151 GAGCAAATCT AATTTCTGAA GAAGACAAGC TCCTGCAAAA CTTTGAGATT
1201 ATCCCGATCT ACTATCTGAG CTATGACTAT CTCACTCAAG ATTTTATAGA
1251 GGGAGTAATC TATAATGCTT CTGGAGCTGT AGATCTCAAA TATACCTATT
1301 TCCCCTAG



The PSORT algorithm predicts that this protein is an outer membrane lipoprotein (0.790). 

The protein was expressed in E.coli and purified as a GST-fusion product, as shown in Figure 13A.
The recombinant protein was used to immunise mice, whose sera were used in a Western blot 
(Figure 13B) and for FACS analysis. A his-tagged protein was also expressed. 

These experiments show that cp6468 is a useful immunogen. These properties are not evident from 
the sequence alone. 



Example 14 



The following C.pneumoniae protein (pid 4376469) was expressed <SEQ ID 27; cp6469>:

1 MKMHRLKPTL KSLIPNLLFL LLTLSSCSKQ KQEPLGKHLV IAMSHDLADL

51 DPRNAYLSRD ASLAKALYEG LTRETDQGIA LALAESYTLS KDHKVYTFKL

101 RPSVWSDGTP LTAYDFEKSI KQLYFEEFSP SIHTLLGVIK NSSAIHNAQK 

151 SLETLGIQAK DDLTLVITLE QPFPYFLTLI ARPVFSPVHH TLRESYKKGT 

201 PPSTYISNGP FVLKKHEHQN YLILEKNPHY YDHESVKLDR VTLKIIPDAS 

251 TATKLFKSKS IDWIGSPWSA PISNEDQKVL SQEKILTYSV SSTTLLIYNL 

301 QKPLIQNKAL RKAIAHAIDR KSILRLVPSG QEAVTLVPPN LSQLNLQKEI 

351 STEERQTKAR AYFQEAKETL SEKELAELSI LYPIDSSNSS IIAQEIQRQL 

401 KDTLGLKIKI QGMEYHCFLK KRRQGDFFIA TGGWIAEYVS PVAFLSILGN

451 PRDLTQWRNS DYEKTLEKLY LPHAYKENLK RAEMIIEEET PIIPLYHGKY 

501 IYAIHPKIQN TFGSLLGHTD LKNIDILS* 

A predicted signal peptide is highlighted. 



The cp6469 nucleotide sequence <SEQ ID 28> is: 
1 ATGAAGATGC ATAGGCTTAA ACCTACCTTA AAAAGTCTGA TCCCTAATCT
51 TCTTTTCTTA TTGCTCACTC TTTCAAGCTG CTCAAAGCAA AAACAAGAAC
101 CCTTAGGAAA ACATCTCGTT ATTGCGATGA GCCATGATCT CGCCGACCTA
151 GATCCTCGCA ATGCCTATTT AAGCAGAGAT GCTTCCCTAG CAAAAGCCCT
201 CTATGAAGGA CTGACAAGAG AAACTGATCA AGGAATCGCA CTGGCTCTTG
251 CAGAAAGTTA TACCCTGTCA AAAGATCATA AGGTCTATAC CTTTAAACTC
301 AGACCTTCTG TGTGGAGCGA TGGCACTCCA CTCACTGCTT ATGACTTTGA
351 AAAATCTATA AAACAACTGT ACTTCGAAGA ATTTTCACCT TCCATACATA
401 CTTTACTCGG CGTGATTAAA AATTCTTCGG CAATCCACAA TGCTCAAAAA
451 TCTCTGGAAA CTCTTGGGAT ACAGGCAAAA GATGATCTTA CTTTGGTGAT
501 TACCCTAGAG CAACCTTTCC CATACTTTCT CACACTTATC GCTCGCCCCG
551 TATTCTCCCC TGTTCATCAC ACCCTTAGGG AATCCTATAA GAAAGGAACA
601 CCCCCATCCA CATACATCTC CAATGGGCCC TTTGTCTTAA AAAAACATGA
651 ACACCAAAAC TACTTAATTT TAGAAAAAAA TCCTCACTAC TATGATCATG
701 AATCAGTAAA GTTAGACCGA GTCACCTTAA AAATTATCCC AGACGCCTCC
751 ACAGCCACGA AACTTTTCAA AAGTAAATCT ATAGATTGGA TTGGCTCACC
801 TTGGAGCGCT CCGATATCTA ACGAAGACCA AAAAGTTCTC TCCCAAGAAA
851 AGATTCTTAC CTATTCTGTT TCAAGCACCA CCCTTCTTAT CTATAACCTG
901 CAAAAACCTC TAATACAAAA TAAAGCCCTC AGGAAAGCCA TTGCTCATGC
951 TATTGATAGA AAATCTATCT TAAGACTCGT GCCTTCAGGA CAAGAAGCTG
1001 TAACTCTAGT TCCCCCAAAT CTTTCACAAC TCAATCTTCA AAAAGAGATC
1051 TCAACAGAAG AACGACAAAC AAAAGCCAGA GCATATTTTC AAGAAGCTAA
1101 AGAAACACTT TCTGAAAAAG AACTCGCAGA ACTCAGCATC CTCTATCCTA
1151 TAGATTCCTC GAATTCCTCC ATCATAGCTC AAGAAATCCA AAGACAACTT
1201 AAAGATACCT TAGGATTGAA AATCAAAATC CAAGGCATGG AGTACCACTG
1251 CTTTTTAAAG AAACGTCGTC AAGGAGATTT CTTCATAGCG ACAGGAGGAT
1301 GGATTGCGGA ATACGTAAGC CCCGTAGCCT TCCTATCTAT TCTAGGCAAC
1351 CCCAGAGACC TCACACAATG GAGAAACAGT GATTACGAAA AGACTTTAGA
1401 GAAACTCTAT CTCCCTCATG CCTACAAAGA GAATTTAAAA CGCGCAGAAA
1451 TGATAATAGA AGAAGAAACC CCGATTATCC CCCTGTATCA CGGCAAATAT
1501 ATTTACGCTA TACATCCTAA AATCCAGAAT ACATTCGGAT CTCTTCTAGG
1551 CCACACAGAT CTCAAAAATA TCGATATCTT AAGTTAG



The PSORT algorithm predicts a periplasmic location (0.934). 

The protein was expressed in E.coli and purified as a GST-fusion product, as shown in Figure 14A. 
The recombinant protein was used to immunise mice, whose sera were used in a Western blot 
(Figure 14B) and for FACS analysis. A his-tagged protein was also expressed. 

These experiments show that cp6469 is a useful immunogen. These properties are not evident from 
the sequence alone. 



Example 15 



The following C.pneumoniae protein (pid 4376602) was expressed <SEQ ID 29; cp6602>: 

1 MAASGGTGGL GGTQGVNLAA VEAAAAKADA AEVVASQEGS EMNMIQQSQD

51 LTNPAAATRT KKKEEKFQTL ESRKKGEAGK AEKKSESTEE KPDTDLADKY

101 ASGNSEISGQ ELRGLRDAIG DDASPEDILA LVQEKIKDPA LQSTALDYLV

151 QTTPPSQGKL KEALIQARNT HTEQFGRTAI GAKNILPASQ EYADQLNVSP

201 SGLRSLYLEV TGDTHTCDQL LSMLQDRYTY QDMAIVSSFL MKGMATELKR

251 QGPYVPSAQL QVLMTETRNL QAVLTSYDYF ESRVPILLDS LKAEGIQTPS

301 DLNFVKVAES YHKIINDKFP TASKVEREVR NLIGDDVDSV TGVLNLFFSA

351 LRQTSSRLFS SADKRQQLGA MIANALDAVN INNEDYPKAS DFPKPYPWS*

The cp6602 nucleotide sequence <SEQ ID 30> is: 

1 ATGGCAGCAT CAGGAGGCAC AGGTGGTTTA GGAGGCACTC AGGGTGTCAA 

51 CCTTGCAGCT GTAGAAGCTG CAGCTGCAAA AGCAGATGCA GCAGAAGTTG 

101 TAGCCAGCCA AGAAGGTTCT GAGATGAACA TGATTCAACA ATCTCAGGAC 

151 CTGACAAATC CCGCAGCAGC AACACGCACG AAAAAAAAGG AAGAGAAGTT 

201 TCAAACTCTA GAATCTCGGA AAAAAGGAGA AGCTGGAAAG GCTGAGAAAA 

251 AATCTGAATC TACAGAAGAG AAGCCTGACA CAGATCTTGC TGATAAGTAT 

301 GCTTCTGGGA ATTCTGAAAT CTCTGGTCAA GAACTTCGCG GCCTGCGTGA 

351 TGCAATAGGA GACGATGCTT CTCCAGAAGA CATTCTTGCT CTTGTACAAG 

401 AGAAAATTAA AGACCCAGCT CTGCAATCCA CAGCTTTGGA CTACCTGGTT 

451 CAAACGACTC CACCCTCCCA AGGTAAATTA AAAGAAGCGC TTATCCAAGC 

501 AAGGAATACT CATACGGAGC AATTCGGACG AACTGCTATT GGTGCGAAAA 

551 ACATCTTATT TGCCTCTCAA GAATATGCAG ACCAACTGAA TGTTTCTCCT 

601 TCAGGGCTTC GCTCTTTGTA CTTAGAAGTG ACTGGAGACA CACATACCTG 

651 TGATCAGCTA CTTTCTATGC TTCAAGACCG CTATACCTAC CAAGATATGG 

701 CTATTGTCAG CTCCTTTCTA ATGAAAGGAA TGGCAACAGA ATTAAAAAGG 

751 CAGGGTCCCT ACGTACCCAG TGCGCAACTA CAAGTTCTCA TGACAGAAAC 

801 TCGTAACCTG CAAGCAGTTC TTACCTCGTA CGATTACTTT GAAAGTCGCG 

851 TTCCTATTTT ACTCGATAGC TTAAAAGCTG AGGGAATCCA AACTCCTTCT 

901 GATCTAAACT TTGTGAAGGT AGCTGAGTCC TACCATAAAA TCATTAACGA 

951 TAAGTTCCCA ACAGCATCTA AAGTAGAACG AGAAGTCCGC AATCTCATAG 

1001 GAGACGATGT TGATTCTGTG ACCGGTGTCT TGAACTTATT CTTTTCTGCT 

1051 TTACGTCAAA CGTCGTCACG CCTTTTCTCT TCAGCAGACA AACGTCAGCA 

1101 ATTAGGAGCT ATGATTGCTA ATGCTTTAGA TGCTGTAAAT ATAAACAATG 

1151 AAGATTATCC CAAAGCATCA GACTTCCCTA AACCCTATCC TTGGTCATGA 



The PSORT algorithm predicts a cytoplasmic location (0.080). 



The protein was expressed in E.coli and purified as both a His-tag and a GST-fusion product, as 
shown in Figure 15A. The recombinant proteins were used to immunise mice, whose sera were used 
in a Western blot (Figure 15B) and for FACS analysis (Figure 15C). 

The cp6602 protein was also identified in the 2D-PAGE experiment (Cpn0324). 

These experiments show that cp6602 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 



Example 16 

The following C.pneumoniae protein (pid 4376727) was expressed <SEQ ID 31; cp6727>: 

1 MKYSLPWLLT SSALVFSLHP LMAANTDLSS SDNYENGSSG SAAFTAKETS 

51 DASGTTYTLT SDVSITNVSA ITPADKSCFT NTGGALSFVG ADHSLVLQTI 

101 ALTHDGAAIN NTNTALSFSG FSSLLIDSAP ATGTSGGKGA ICVTNTEGGT 

151 ATFTDNASVT LQKNTSEKDG AAVSAYSIDL AKTTTAALLD QNTSTKNGGA 

201 LCSTANTTVQ GNSGTVTFSS NTATDKGGGI YSKEKDSTLD ANTGVVTFKS 

251 NTAKTGGAWS SDDNLALTGN TQVLFQENKT TGSAAQANNP EGCGGAICCY 

301 LATATDKTGL AISQNQEMSF TSNTTTANGG AIYATKCTLD GNTTLTFDQN 

351 TATAGCGGAI YTETEDFSLK GSTGTVTFST NTAKTGGALY SKGNSSLTGN 

401 TNLLFSGNKA TGPSNSSANQ EGCGGAILAF IDSGSVSDKT GLSIANNQEV 

451 SLTSNAATVS GGAIYATKCT LTGNGSLTFD GNTAGTSGGA IYTETEDFTL 

501 TGSTGTVTFS TNTAKTGGAL YSKGNNSLSG NTNLLFSGNK ATGPSNSSAN 

551 QEGCGGAILS FLESASVSTK KGLWIEDNEN VSLSGNTATV SGGAIYATKC 

601 ALHGNTTLTF DGNTAETAGG AIYTETEDFT LTGSTGTVTF STNTAKTAGA 

651 LHTKGNTSFT KNKALVFSGN SATATATTTT DQEGCGGAIL CNISESDIAT 

701 KSLTLTENES LSFINNTAKR SGGGIYAPKC VISGSESINF DGNTAETSGG 

751 AIYSKNLSIT ANGPVSFTNN SGGKGGAIYI ADSGELSLEA IDGDITFSGN 

801 RATEGTSTPN SIHLGAGAKI TKLAAAPGHT IYFYDPITME APASGGTIEE 

851 LVINPVVKAI VPPPQPKNGP IASVPVVPVA PANPNTGTIV FSSGKLPSQD 

901 ASIPANTTTI LNQKINLAGG NVVLKEGATL QVYSFTQQPD STVFMDAGTT 

951 LETTTTNNTD GSIDLKNLSV NLDALDGKRM ITIAVNSTSG GLKISGDLKF 

1001 HNNEGSFYDN PGLKANLNLP FLDLSSTSGT VNLDDFNPIP SSMAAPDYGY 

1051 QGSWTLVPKV GAGGKVTLVA EWQALGYTPK PELRATLVPN SLWNAYVNIH 

1101 SIQQEIATAM SDAPSHPGIW IGGIGNAFHQ DKQKENAGFR LISRGYIVGG 

1151 SMTTPQEYTF AVAFSQLFGK SKDYVVSDIK SQVYAGSLCA QSSYVIPLHS 

1201 SLRRHVLSKV LPELPGETPL VLHGQVSYGR NHHNMTTKLA NNTQGKSDWD 

1251 SHSFAVEVGG SLPVDLNYRY LTSYSPYVKL QVVSVNQKGF QEVAADPRIF 

1301 DASHLVNVSI PMGLTFKHES AKPPSALLLT LGYAVDAYRD HPHCLTSLTN 

1351 GTSWSTFATN LSRQAFFAEA SGHLKLLHGL DCFASGSCEL RSSSRSYNAN 

1401 CGTRYSF* 



A predicted signal peptide is highlighted. 

The cp6727 nucleotide sequence <SEQ ID 32> is: 

1 ATGAAATATT CTTTACCTTG GCTACTTACC TCTTCGGCTT TAGTTTTCTC 

51 CCTACATCCA CTAATGGCTG CTAACACGGA TCTCTCATCA TCCGATAACT 

101 ATGAAAATGG TAGTAGTGGT AGCGCAGCAT TCACTGCCAA GGAAACTTCG 

151 GATGCTTCAG GAACTACCTA CACTCTCACT AGCGATGTTT CTATTACGAA 

201 TGTATCTGCA ATTACTCCTG CAGATAAAAG CTGTTTTACA AACACAGGAG 

251 GAGCATTGAG TTTTGTTGGA GCTGATCACT CATTGGTTCT GCAAACCATA 

301 GCGCTTACGC ATGATGGTGC TGCAATTAAC AATACCAACA CAGCTCTTTC 

351 TTTCTCAGGA TTCTCGTCAC TCTTAATCGA CTCAGCTCCA GCAACAGGAA 

401 CTTCGGGCGG CAAGGGTGCT ATTTGTGTGA CAAATACAGA GGGAGGTACT 

451 GCGACTTTTA CTGACAATGC CAGTGTCACC CTCCAAAAAA ATACTTCAGA 

501 AAAAGATGGA GCTGCAGTTT CTGCCTACAG CATCGATCTT GCTAAGACTA 

551 CGACAGCAGC TCTCTTAGAT CAAAATACTA GCACAAAAAA TGGCGGGGCC 

601 CTCTGTAGTA CAGCAAACAC TACAGTCCAA GGAAACTCAG GAACGGTGAC 

651 CTTCTCCTCA AATACTGCTA CAGATAAAGG TGGGGGGATC TACTCAAAAG 

701 AAAAGGATAG CACGCTAGAT GCCAATACAG GAGTCGTTAC CTTCAAATCT 

751 AATACTGCAA AGACGGGGGG TGCTTGGAGC TCTGATGACA ATCTTGCTCT 

801 TACCGGCAAC ACTCAAGTAC TTTTTCAGGA AAATAAAACA ACCGGCTCAG 

851 CAGCACAGGC AAATAACCCG GAAGGTTGTG GTGGGGCAAT CTGTTGTTAT 

901 CTTGCTACAG CAACAGACAA AACTGGATTA GCCATTTCTC AGAATCAAGA 

951 AATGAGCTTC ACTAGTAATA CAACAACTGC GAATGGTGGA GCGATCTACG 

1001 CTACTAAATG TACTCTGGAT GGAAACACAA CTCTTACCTT CGATCAGAAT 

1051 ACTGCGACAG CAGGATGTGG CGGAGCTATC TATACAGAAA CTGAAGATTT 

1101 TTCTCTTAAG GGAAGTACGG GAACCGTGAC CTTCAGCACA AATACAGCAA 

1151 AGACAGGCGG CGCCTTATAT TCTAAAGGAA ACAGCTCGCT GACTGGAAAT 

1201 ACCAACCTGC TCTTTTCAGG GAACAAAGCT ACGGGCCCGA GTAATTCTTC 

1251 AGCAAATCAA GAGGGTTGCG GTGGGGCAAT CCTAGCCTTT ATTGATTCAG 

1301 GATCCGTAAG CGATAAAACA GGACTATCGA TTGCAAACAA CCAAGAAGTC 

1351 AGCCTCACTA GTAATGCTGC AACAGTAAGT GGTGGTGCGA TCTATGCTAC 

1401 CAAATGTACT CTAACTGGAA ACGGCTCCCT GACCTTTGAC GGCAATACTG 

1451 CTGGAACTTC AGGAGGGGCG ATCTATACAG AAACTGAAGA TTTTACTCTT 

1501 ACAGGAAGTA CAGGAACCGT GACCTTCAGC ACAAATACAG CAAAGACAGG 

1551 CGGCGCCTTA TATTCTAAAG GCAACAACTC TCTGTCTGGT AATACCAACC 

1601 TGCTCTTTTC AGGGAACAAA GCTACGGGCC CGAGTAATTC TTCAGCAAAT 

1651 CAAGAGGGTT GCGGTGGGGC AATCCTATCG TTTCTTGAGT CAGCATCTGT 

1701 AAGTACTAAA AAAGGACTCT GGATTGAAGA TAACGAAAAC GTGAGTCTCT 

1751 CTGGTAATAC TGCAACAGTA AGTGGCGGTG CGATCTATGC GACCAAGTGT 

1801 GCTCTGCATG GAAACACGAC TCTTACCTTT GATGGCAATA CTGCCGAAAC 

1851 TGCAGGAGGA GCGATCTATA CAGAAACCGA AGATTTTACT CTTACGGGAA 

1901 GTACGGGAAC CGTGACCTTC AGCACAAATA CAGCAAAGAC AGCAGGGGCT 

1951 CTACATACTA AAGGAAATAC TTCCTTTACC AAAAATAAGG CTCTTGTATT 

2001 TTCTGGAAAT TCAGCAACAG CAACAGCAAC AACAACTACA GATCAAGAAG 

2051 GTTGTGGTGG AGCGATCCTC TGTAATATCT CAGAGTCTGA CATAGCTACA 

2101 AAAAGCTTAA CTCTTACTGA AAATGAGAGT TTAAGTTTCA TTAACAATAC 

2151 GGCAAAAAGA AGTGGTGGTG GTATTTATGC TCCTAAGTGT GTAATCTCAG 

2201 GCAGTGAATC CATAAACTTT GATGGCAATA CTGCTGAAAC TTCGGGAGGA 

2251 GCGATTTATT CGAAAAACCT TTCGATTACA GCTAACGGTC CTGTCTCCTT 

2301 TACCAATAAT TCTGGAGGCA AGGGAGGCGC CATTTATATA GCCGATAGCG 

2351 GAGAACTTTC CTTAGAGGCT ATTGATGGGG ATATTACTTT CTCAGGGAAC 

2401 CGAGCGACTG AGGGAACTTC AACTCCCAAC TCGATCCATT TAGGTGCAGG 

2451 GGCTAAGATC ACTAAGCTTG CAGCAGCTCC TGGTCATACG ATTTATTTTT 

2501 ATGATCCTAT TACGATGGAA GCTCCTGCAT CTGGAGGAAC AATAGAGGAG 

2551 TTAGTCATCA ATCCTGTTGT CAAAGCTATT GTTCCTCCTC CCCAACCAAA 

2601 AAATGGTCCT ATAGCTTCAG TGCCTGTAGT CCCTGTAGCA CCTGCAAACC 

2651 CAAACACGGG AACTATAGTA TTTTCTTCTG GAAAACTCCC CAGTCAAGAT 

2701 GCCTCGATTC CTGCAAATAC TACCACCATA CTGAACCAGA AGATCAACTT 

2751 AGCAGGAGGA AATGTCGTTT TAAAAGAAGG AGCCACCCTA CAAGTATATT 

2801 CCTTCACACA GCAGCCTGAT TCTACAGTAT TCATGGATGC AGGAACGACC 

2851 TTAGAGACCA CGACAACTAA CAATACAGAT GGCAGCATCG ATCTAAAGAA 

2901 TCTCTCTGTA AATCTGGATG CTTTAGATGG CAAGCGTATG ATAACGATTG 

2951 CCGTAAACAG CACAAGTGGG GGATTAAAAA TCTCAGGGGA TCTGAAATTC 

3001 CATAACAATG AAGGAAGTTT CTATGACAAT CCTGGGTTGA AAGCAAACTT 

3051 AAATCTTCCT TTCTTAGATC TTTCTTCTAC TTCAGGAACT GTAAATTTAG 

3101 ACGACTTCAA TCCGATTCCT TCTAGCATGG CTGCTCCGGA TTATGGGTAT 

3151 CAAGGGAGTT GGACTCTGGT TCCTAAAGTA GGAGCTGGAG GGAAGGTGAC 

3201 TTTGGTCGCG GAATGGCAAG CGTTAGGATA CACTCCTAAA CCAGAGCTTC 

3251 GTGCGACTTT AGTTCCTAAT AGCCTTTGGA ATGCTTATGT AAACATCCAT 

3301 TCTATACAGC AGGAGATCGC CACTGCGATG TCGGACGCTC CCTCACATCC 

3351 AGGGATTTGG ATTGGAGGTA TTGGCAACGC CTTCCATCAA GACAAGCAAA 

3401 AGGAAAATGC AGGATTCCGT TTGATTTCCA GAGGTTATAT TGTTGGTGGC 

3451 AGCATGACCA CCCCTCAAGA ATATACCTTT GCTGTTGCAT TCAGCCAACT 

3501 CTTTGGCAAA TCTAAGGATT ACGTAGTCTC GGATATTAAA TCTCAAGTCT 

3551 ATGCAGGATC TCTCTGTGCT CAGAGCTCTT ATGTCATTCC CCTGCATAGC 

3601 TCATTACGTC GCCACGTCCT CTCTAAGGTC CTTCCAGAGC TCCCAGGAGA 

3651 AACTCCCCTT GTTCTCCATG GTCAAGTTTC CTATGGAAGA AACCACCATA 

3701 ATATGACGAC AAAGCTTGCG AACAACACAC AAGGGAAATC AGACTGGGAC 

3751 AGCCATAGCT TCGCTGTTGA AGTCGGTGGT TCTCTTCCTG TAGATCTAAA 

3801 CTACAGATAC CTTACCAGCT ACTCTCCCTA TGTGAAACTC CAAGTTGTGA 

3851 GTGTAAATCA AAAAGGATTC CAAGAGGTTG CTGCTGATCC ACGTATCTTT 

3901 GACGCTAGCC ATCTGGTCAA CGTGTCTATC CCTATGGGAC TCACCTTCAA 

3951 ACACGAATCA GCAAAGCCCC CCAGTGCTTT GCTTCTTACT TTAGGTTACG 

4001 CTGTAGATGC TTACCGGGAT CACCCTCACT GCCTGACCTC CTTAACAAAT 

4051 GGCACCTCGT GGTCTACGTT TGCTACAAAC TTATCACGAC AAGCTTTCTT 

4101 TGCTGAGGCT TCTGGACATC TGAAGTTACT TCATGGTCTT GACTGCTTCG 

4151 CTTCTGGAAG TTGTGAACTG CGCAGCTCCT CAAGAAGCTA TAATGCAAAC 

4201 TGTGGAACTC GTTATTCTTT CTAA 



The PSORT algorithm predicts an outer membrane location (0.915). 

The protein was expressed in E.coli and purified as a his-tag product, as shown in Figure 16A. The 
recombinant protein was used to immunise mice, whose sera were used in a Western blot (Figure 
16B) and for FACS analysis (Figure 16C). A GST-fusion protein was also expressed. 

The cp6727 protein was also identified in the 2D-PAGE experiment (Cpn0444). 

These experiments show that cp6727 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 



Example 17 

The following C.pneumoniae protein (pid 4376731) was expressed <SEQ ID 33; cp6731>: 

1 MKSSLHWFLI SSSLALPLSL NFSAFAAVVE INLGPTNSFS GPGTYTPPAQ 

51 TTNADGTIYN LTGDVSITNA GSPTALTASC FKETTGNLSF QGHGYQFLLQ 

101 NIDAGANCTF TNTAANKLLS FSGFSYLSLI QTTNATTGTG AIKSTGACSI 

151 QSNYSCYFGQ NFSNDNGGAL QGSSISLSLN PNLTFAKNKA TQKGGALYST 

201 GGITINNTLN SASFSENTAA NNGGAIYTEA SSFISSNKAI SFINNSVTAT 

251 SATGGAIYCS STSAPKPVLT LSDNGELNFI GNTAITSGGA IYTDNLVLSS 

301 GGPTLFKNNS AIDTAAPLGG AIAIADSGSL SLSALGGDIT FEGNTVVKGA 

351 SSSQTTTRNS INIGNTNAKI VQLRASQGNT IYFYDPITTS ITAALSDALN 

401 LNGPDLAGNP AYQGTIVFSG EKLSEAEAAE ADNLKSTIQQ PLTLAGGQLS 

451 LKSGVTLVAK SFSQSPGSTL LMDAGTTLET ADGITINNLV LNVDSLKETK 

501 KATLKATQAS QTVTLSGSLS LVDPSGNVYE DVSWNNPQVF SCLTLTADDP 

551 ANIHITDLAA DPLEKNPIHW GYQGNWALSW QEDTATKSKA ATLTWTKTGY 

601 NPNPERRGTL VANTLWGSFV DVRSIQQLVA TKVRQSQETR GIWCEGISNF 

651 FHKDSTKINK GFRHISAGYV VGATTTLASD NLITAAFCQL FGKDRDHFIN 

701 KNRASAYAAS LHLQHLATLS SPSLLRYLPG SESEQPVLFD AQISYIYSKN 

751 TMKTYYTQAP KGESSWYNDG CALELASSLP HTALSHEGLF HAYFPFIKVE 

801 ASYIHQDSFK ERNTTLVRSF DSGDLINVSV PIGITFERFS RNERASYEAT 

851 VIYVADVYRK NPDCTTALLI NNTSWKTTGT NLSRQAGIGR AGIFYAFSPN 

901 LEVTSNLSME IRGSSRSYNA DLGGKFQF* 



A predicted signal peptide is highlighted. 

The cp6731 nucleotide sequence <SEQ ID 34> is: 



1 ATGAAATCCT CTCTTCATTG GTTTTTAATC TCGTCATCTT TAGCACTTCC 

51 CTTGTCACTA AATTTCTCTG CGTTTGCTGC TGTTGTTGAA ATCAATCTAG 

101 GACCTACCAA TAGCTTCTCT GGACCAGGAA CCTACACTCC TCCAGCCCAA 

151 ACAACAAATG CAGATGGAAC TATCTATAAT CTAACAGGGG ATGTCTCAAT 

201 CACCAATGCA GGATCTCCGA CAGCTCTAAC CGCTTCCTGC TTTAAAGAAA 

251 CTACTGGGAA TCTTTCTTTC CAAGGCCACG GCTACCAATT TCTCCTACAA 

301 AATATCGATG CGGGAGCGAA CTGTACCTTT ACCAATACAG CTGCAAATAA 

351 GCTTCTCTCC TTTTCAGGAT TCTCCTATTT GTCACTAATA CAAACCACGA 

401 ATGCTACCAC AGGAACAGGA GCCATCAAGT CCACAGGAGC TTGTTCTATT 

451 CAGTCGAACT ATAGTTGCTA CTTTGGCCAA AACTTTTCTA ATGACAATGG 

501 AGGCGCCCTC CAAGGCAGCT CTATCAGTCT ATCGCTAAAC CCCAACCTAA 

551 CGTTTGCCAA AAACAAAGCA ACGCAAAAAG GGGGTGCCCT CTATTCCACG 

601 GGAGGGATTA CAATTAACAA TACGTTAAAC TCAGCATCAT TTTCTGAAAA 

651 TACCGCGGCG AACAATGGCG GAGCCATTTA CACGGAAGCT AGCAGTTTTA 

701 TTAGCAGCAA CAAAGCAATT AGCTTTATAA ACAATAGTGT GACCGCAACC 

751 TCAGCTACAG GGGGAGCCAT TTACTGTAGT AGTACATCAG CCCCCAAACC 

801 AGTCTTAACT CTATCAGACA ACGGGGAACT GAACTTTATA GGAAATACAG 

851 CAATTACTAG TGGTGGGGCG ATTTATACTG ACAATCTAGT TCTTTCTTCT 

901 GGAGGACCTA CGCTTTTTAA AAACAACTCT GCTATAGATA CTGCAGCTCC 

951 CTTAGGAGGA GCAATTGCGA TTGCTGACTC TGGATCTTTG AGTCTTTCGG 

1001 CTCTTGGTGG AGACATCACT TTTGAAGGAA ACACAGTAGT CAAAGGAGCT 

1051 TCTTCGAGTC AGACCACTAC CAGAAATTCT ATTAACATCG GAAACACCAA 

1101 TGCTAAGATT GTACAGCTGC GAGCCTCTCA AGGCAATACT ATCTACTTCT 

1151 ATGATCCTAT AACAACTAGC ATCACTGCAG CTCTCTCAGA TGCTCTAAAC 

1201 TTAAATGGTC CTGACCTTGC AGGGAATCCT GCATATCAAG GAACCATCGT 

1251 ATTTTCTGGA GAGAAGCTCT CGGAAGCAGA AGCTGCAGAA GCTGATAATC 

1301 TCAAATCTAC AATTCAGCAA CCTCTAACTC TTGCGGGAGG GCAACTCTCT 

1351 CTTAAATCAG GAGTCACTCT AGTTGCTAAG TCCTTTTCGC AATCTCCGGG 

1401 CTCTACCCTC CTCATGGATG CAGGGACCAC ATTAGAAACC GCTGATGGGA 

1451 TCACTATCAA TAATCTTGTT CTCAATGTAG ATTCCTTAAA AGAGACCAAG 

1501 AAGGCTACGC TAAAAGCAAC ACAAGCAAGT CAGACAGTCA CTTTATCTGG 

1551 ATCGCTCTCT CTTGTAGATC CTTCTGGAAA TGTCTACGAA GATGTCTCTT 

1601 GGAATAACCC TCAAGTCTTT TCTTGTCTCA CTCTTACTGC TGACGACCCC 

1651 GCGAATATTC ACATCACAGA CTTAGCTGCT GATCCCCTAG AAAAAAATCC 

1701 TATCCATTGG GGATACCAAG GGAATTGGGC ATTATCTTGG CAAGAGGATA 

1751 CTGCGACTAA ATCCAAAGCA GCGACTCTTA CCTGGACAAA AACAGGATAC 

1801 AATCCGAATC CTGAGCGTCG TGGAACCTTA GTTGCTAACA CGCTATGGGG 

1851 ATCCTTTGTT GATGTGCGCT CCATACAACA GCTTGTAGCC ACTAAAGTAC 

1901 GCCAATCTCA AGAAACTCGC GGCATCTGGT GTGAAGGGAT CTCGAACTTC 

1951 TTCCATAAAG ATAGCACGAA GATAAATAAA GGTTTTCGCC ACATAAGTGC 

2001 AGGTTATGTT GTAGGAGCGA CTACAACATT AGCTTCTGAT AATCTTATCA 

2051 CTGCAGCCTT CTGCCAATTA TTCGGGAAAG ATAGAGATCA CTTTATAAAT 

2101 AAAAATAGAG CTTCTGCCTA TGCAGCTTCT CTCCATCTCC AGCATCTAGC 

2151 GACCTTGTCT TCTCCAAGCT TGTTACGCTA CCTTCCTGGA TCTGAAAGTG 

2201 AGCAGCCTGT CCTCTTTGAT GCTCAGATCA GCTATATCTA TAGTAAAAAT 

2251 ACTATGAAAA CCTATTACAC CCAAGCACCA AAGGGAGAGA GCTCGTGGTA 

2301 TAATGACGGT TGCGCTCTGG AACTTGCGAG CTCCCTACCA CACACTGCTT 

2351 TAAGCCATGA GGGTCTCTTC CACGCGTATT TTCCTTTCAT CAAAGTAGAA 

2401 GCTTCGTACA TACACCAAGA TAGCTTCAAA GAACGTAATA CTACCTTGGT 

2451 ACGATCTTTC GATAGCGGTG ATTTAATTAA CGTCTCTGTG CCTATTGGAA 

2501 TTACCTTCGA GAGATTCTCG AGAAACGAGC GTGCGTCTTA CGAAGCTACT 

2551 GTCATCTACG TTGCCGATGT CTATCGTAAG AATCCTGACT GCACGACAGC 

2601 TCTCCTAATC AACAATACCT CGTGGAAAAC TACAGGAACG AATCTCTCAA 

2651 GACAAGCTGG TATCGGAAGA GCAGGGATCT TTTATGCCTT CTCTCCAAAT 

2701 CTTGAGGTCA CAAGTAACCT ATCTATGGAA ATTCGTGGAT CTTCACGCAG 

2751 CTACAATGCA GATCTTGGAG GTAAGTTCCA GTTCTAA 

The PSORT algorithm predicts an outer membrane location (0.926). 

The protein was expressed in E.coli and purified as a his-tag product, as shown in Figure 17A. A 
GST-fusion protein was also expressed. The recombinant proteins were used to immunise mice, 
whose sera were used in a Western blot (Figure 17B; his-tag) and for FACS analysis (Figure 17C; 
his-tag and GST-fusion). 

The GST-fusion protein also showed good cross-reactivity with human sera, including sera from 
patients with pneumonitis. Less cross-reactivity was seen with the his-fusion. 

These experiments show that cp6731 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 

Example 18 

The following C.pneumoniae protein (pid 4376737) was expressed <SEQ ID 35; cp6737>: 

1 MPLSFKSSSF CLLACLCSAS CAFAETRLGG NFVPPITNQG EEILLTSDFV 

51 CSNFLGASFS SSFINSSSNL SLLGKGLSLT FTSCQAPTNS NYALLSAAET 

101 LTFKNFSSIN FTGNQSTGLG GLIYGKDIVF QSIKDLIFTT NRVAYSPASV 

151 TTSATPAITT VTTGASALQP TDSLTVENIS QSIKFFGNLA NFGSAISSSP 

201 TAVVKFINNT ATMSFSHNFT SSGGGVIYGG SSLLFENNSG CIIFTANSCV 

251 NSLKGVTPSS GTYALGSGGA ICIPTGTFEL KNNQGKCTFS YNGTPNDAGA 

301 IYAETCNIVG NQGALLLDSN TAARNGGAIC AKVLNIQGRG PIEFSRNRAE 

351 KGGAIFIGPS VGDPAKQTST LTILASEGDI AFQGNMLNTK PGIRNAITVE 

401 AGGEIVSLSA QGGSRLVFYD PITHSLPTTS PSNKDITINA NGASGSVVFT 

451 SKGLSSTELL LPANTTTILL GTVKIASGEL KITDNAVVNV LGFATQGSGQ 

501 LTLGSGGTLG LATPTGAPAA VDFTIGKLAF DPFSFLKRDF VSASVNAGTK 

551 NVTLTGALVL DEHDVTDLYD MVSLQTPVAI PIAVFKGATV TKTGFPDGEI 

601 ATPSHYGYQG KWSYTWSRPL LIPAPDGGFP GGPSPSANTL YAVWNSDTLV 

651 RSTYILDPER YGEIVSNSLW ISFLGNQAFS DILQDVLLID HPGLSITAKA 

701 LGAYVEHTPR QGHEGFSGRY GGYQAALSMN YTDHTTLGLS FGQLYGKTNA 

751 NPYDSRCSEQ MYLLSFFGQF PIVTQKSEAL ISWKAAYGYS KNHLNTTYLR 

801 PDKAPKSQGQ WHNNSYYVLI SAEHPFLNWC LLTRPLAQAW DLSGFISAEF 

851 LGGWQSKFTE TGDLQRSFSR GKGYNVSLPI GCSSQWFTPF KKAPSTLTIK 

901 LAYKPDIYRV NPHNIVTVVS NQESTSISGA NLRRHGLFVQ IHDVVDLTED 

951 TQAFLNYTFD GKNGFTNHRV STGLKSTF* 

A predicted signal peptide is highlighted. 



The cp6737 nucleotide sequence <SEQ ID 36> is: 

1 ATGCCTCTTT CTTTCAAATC TTCATCTTTT TGTCTACTTG CCTGTTTATG 

51 TAGTGCAAGT TGCGCGTTTG CTGAGACTAG ACTCGGAGGG AACTTTGTTC 

101 CTCCAATTAC GAATCAGGGT GAAGAGATCT TACTCACTTC AGATTTTGTT 

151 TGTTCAAACT TCTTGGGGGC GAGTTTTTCA AGTTCCTTTA TCAATAGTTC 

201 CAGCAATCTC TCCTTATTAG GGAAGGGCCT TTCCTTAACG TTTACCTCTT 

251 GTCAAGCTCC TACAAATAGT AACTATGCGC TACTTTCTGC CGCAGAGACT 

301 CTGACCTTCA AGAATTTTTC TTCTATAAAC TTTACAGGGA ACCAATCGAC 

351 AGGACTTGGC GGCCTCATCT ACGGAAAAGA TATTGTTTTC CAATCTATCA 

401 AAGATTTGAT CTTCACTACG AACCGTGTTG CCTATTCTCC AGCATCTGTA 

451 ACTACGTCGG CAACTCCCGC AATCACTACA GTAACTACAG GAGCCTCTGC 

501 TCTCCAACCT ACAGACTCAC TCACTGTCGA AAACATATCC CAATCGATCA 

551 AGTTTTTTGG GAACCTTGCC AACTTCGGCT CTGCAATTAG CAGTTCTCCC 

601 ACGGCAGTCG TTAAATTCAT CAATAACACC GCTACCATGA GCTTCTCCCA 

651 TAACTTTACT TCGTCAGGAG GCGGCGTGAT TTATGGAGGA AGCTCTCTCC 

701 TTTTTGAAAA CAATTCTGGA TGCATCATCT TCACCGCCAA CTCCTGTGTG 

751 AACAGCTTAA AAGGCGTCAC CCCTTCATCA GGAACCTATG CTTTAGGAAG 

801 TGGCGGAGCC ATCTGCATCC CTACGGGAAC TTTCGAATTA AAAAACAATC 

851 AGGGGAAGTG CACCTTCTCT TATAATGGTA CACCAAATGA TGCGGGTGCG 

901 ATCTACGCCG AAACCTGCAA CATCGTAGGG AACCAGGGTG CCTTGCTCCT 

951 AGATAGCAAC ACTGCAGCGA GAAATGGCGG AGCCATCTGT GCTAAAGTGC 

1001 TCAATATTCA AGGACGCGGT CCTATTGAAT TCTCTAGAAA CCGCGCGGAG 

1051 AAGGGTGGAG CTATTTTCAT AGGCCCCTCT GTTGGAGACC CTGCGAAGCA 

1101 AACATCGACA CTTACGATTT TGGCTTCCGA AGGTGATATT GCGTTCCAAG 

1151 GAAACATGCT CAATACAAAA CCTGGAATCC GCAATGCCAT CACTGTAGAA 

1201 GCAGGGGGAG AGATTGTGTC TCTATCTGCA CAAGGAGGCT CACGTCTTGT 

1251 ATTTTATGAT CCCATTACAC ATAGCCTCCC AACCACAAGT CCGTCTAATA 

1301 AAGACATTAC AATCAACGCT AATGGCGCTT CAGGATCTGT AGTCTTTACA 

1351 AGTAAGGGAC TCTCCTCTAC AGAACTCCTG TTGCCTGCCA ACACGACAAC 

1401 TATACTTCTA GGAACAGTCA AGATCGCTAG TGGAGAACTG AAGATTACTG 

1451 ACAATGCGGT TGTCAATGTT CTTGGCTTCG CTACTCAGGG CTCAGGTCAG 

1501 CTTACCCTGG GCTCTGGAGG AACCTTAGGG CTGGCAACAC CCACGGGAGC 

1551 ACCTGCCGCT GTAGACTTTA CGATTGGAAA GTTAGCATTC GATCCTTTTT 

1601 CCTTCCTAAA AAGAGATTTT GTTTCAGCAT CAGTAAATGC AGGCACAAAA 

1651 AACGTCACTT TAACAGGAGC TCTGGTTCTT GATGAACATG ACGTTACAGA 

1701 TCTTTATGAT ATGGTGTCAT TACAAACTCC AGTAGCAATT CCTATCGCTG 

1751 TTTTCAAAGG AGCAACCGTT ACTAAGACAG GATTTCCTGA TGGGGAGATT 

1801 GCGACTCCAA GCCACTACGG CTACCAAGGA AAGTGGTCCT ACACATGGTC 

1851 CCGTCCCCTG TTAATTCCAG CTCCTGATGG AGGATTTCCT GGAGGTCCCT 

1901 CTCCTAGCGC AAATACTCTC TATGCTGTAT GGAATTCAGA CACTCTCGTG 

1951 CGTTCTACCT ATATCTTAGA TCCCGAGCGT TACGGAGAAA TTGTCAGCAA 

2001 CAGCTTATGG ATTTCCTTCT TAGGAAATCA GGCATTCTCT GATATTCTCC 

2051 AAGATGTTCT TTTGATAGAT CATCCCGGGT TGTCCATAAC CGCGAAAGCT 

2101 TTAGGAGCCT ATGTCGAACA CACACCAAGA CAAGGACATG AGGGCTTTTC 

2151 AGGTCGCTAT GGAGGCTACC AAGCTGCGCT ATCTATGAAC TACACGGACC 

2201 ACACTACGTT AGGACTTTCT TTCGGGCAGC TTTATGGAAA AACTAACGCC 

2251 AACCCCTACG ATTCACGTTG CTCAGAACAA ATGTATTTAC TCTCGTTCTT 

2301 TGGTCAATTC CCTATCGTGA CTCAAAAGAG CGAGGCCTTA ATTTCCTGGA 

2351 AAGCAGCTTA TGGTTATTCC AAAAATCACC TAAATACCAC CTACCTCAGA 

2401 CCTGACAAAG CTCCAAAATC TCAAGGGCAA TGGCATAACA ATAGTTACTA 

2451 TGTTCTTATT TCTGCAGAAC ATCCTTTCCT AAACTGGTGT CTTCTTACAA 

2501 GACCTCTGGC TCAAGCTTGG GATCTTTCAG GTTTTATTTC CGCAGAATTC 

2551 CTAGGTGGTT GGCAAAGTAA GTTCACAGAA ACTGGAGATC TGCAACGTAG 

2601 CTTTAGTAGA GGTAAAGGGT ACAATGTTTC CCTACCGATA GGATGTTCTT 

2651 CTCAATGGTT CACACCATTT AAGAAGGCTC CTTCTACACT GACCATCAAA 

2701 CTTGCCTACA AGCCTGATAT CTATCGTGTC AACCCTCACA ATATTGTGAC 

2751 TGTCGTCTCA AACCAAGAGA GCACTTCGAT CTCAGGAGCA AATCTACGCC 

2801 GCCACGGTTT GTTTGTACAA ATCCATGATG TAGTAGATCT CACCGAGGAC 

2851 ACTCAGGCCT TTCTAAACTA TACCTTTGAC GGGAAAAATG GATTTACAAA 

2901 CCACCGAGTG TCTACAGGAC TAAAATCCAC ATTTTAA 



The PSORT algorithm predicts an outer membrane location (0.940). 

The protein was expressed in E.coli and purified as a GST-fusion product, as shown in Figure 18A. 
The recombinant protein was used to immunise mice, whose sera were used in an immunoblot 
analysis (Figure 18B) and for FACS analysis (Figure 18C). A his-tagged protein was also 
expressed. 

The cp6737 protein was also identified in the 2D-PAGE experiment (Cpn0454) and showed good 
cross-reactivity with human sera, including sera from patients with pneumonitis. 

These experiments show that cp6737 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 



Example 19 

The following C.pneumoniae protein (pid 4377090) was expressed <SEQ ID 37; cp7090>: 

1 MNIHSLWKLC TLLALLALPA CSLSPNYGWE DSCNTCHHTR RKKPSSFGFV 

51 PLYTEEDFNP NFTFGEYDSK EEKQYKSSQV AAFRNITFAT DSYTIKGEEN 

101 LAILTNLVHY MKKNPKATLY IEGHTDERGA ASYNLALGAR RANAIKEHLR 

151 KQGISADRLS TISYGKEHPL NSGHNELAWQ QNRRTEFKIH AR* 

A predicted signal peptide is highlighted. 



The cp7090 nucleotide sequence <SEQ ID 38> is: 

1 ATGAATATAC ATTCCCTATG GAAACTTTGT ACTTTATTGG CTTTACTTGC 

51 ATTGCCAGCA TGTAGCCTTT CCCCTAATTA TGGCTGGGAG GATTCCTGTA 

101 ATACATGCCA TCATACAAGA CGAAAAAAGC CTTCTTCTTT TGGCTTTGTT 

151 CCTCTCTATA CCGAAGAGGA CTTTAACCCT AATTTTACCT TCGGTGAGTA 

201 TGATTCCAAA GAAGAAAAAC AATACAAGTC AAGCCAAGTT GCAGCATTTC 

251 GTAATATCAC CTTTGCTACA GACAGCTATA CAATTAAAGG TGAAGAGAAC 

301 CTTGCGATTC TCACGAACTT GGTTCACTAC ATGAAGAAAA ACCCGAAAGC 

351 TACACTGTAC ATTGAAGGGC ATACTGACGA GCGTGGAGCT GCATCCTATA 

401 ACCTTGCTTT AGGAGCACGA CGAGCCAATG CGATTAAAGA GCATCTCCGA 

451 AAGCAGGGAA TCTCTGCAGA TCGTCTATCT ACTATTTCCT ACGGAAAAGA 

501 ACATCCTTTA AATTCGGGAC ACAACGAACT AGCATGGCAA CAAAATCGCC 

551 GTACAGAGTT TAAGATTCAT GCACGCTAA 
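
The correspondence between a nucleotide listing such as <SEQ ID 38> and its protein listing such as <SEQ ID 37> can be checked by translating the open reading frame codon by codon. The following is a minimal sketch only, assuming the standard genetic code; the helper names `clean` and `translate` are hypothetical and are not part of the disclosure. 

```python
# Illustrative sketch: translate a nucleotide listing into its protein sequence.
# Assumption: the standard genetic code applies. The function names are hypothetical.

BASES = "TCAG"
# Amino acids for all 64 codons in the conventional TCAG ordering ('*' = stop).
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def clean(listing):
    """Drop position numbers, spaces and line breaks, keeping only A, C, G, T."""
    return "".join(ch for ch in listing.upper() if ch in "ACGT")


def translate(listing):
    """Translate from the first base, three bases at a time, ignoring any trailing partial codon."""
    dna = clean(listing)
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


# First thirty bases of the cp7090 listing, including the position label.
print(translate("1 ATGAATATAC ATTCCCTATG GAAACTTTGT"))  # MNIHSLWKLC
```

The position labels and spacing of a listing can be passed in unchanged, because only the four base letters are retained before translation. 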

The PSORT algorithm predicts an outer membrane location (0.790). 

The protein was expressed in E.coli and purified as a GST-fusion product, as shown in Figure 19A. 
A his-tagged protein was also expressed. The recombinant proteins were used to immunise mice, 
whose sera were used in a Western blot (Figure 19B) and for FACS analysis. 

These experiments show that cp7090 is a useful immunogen. These properties are not evident from the 
sequence alone. 

Example 20 

The following C.pneumoniae protein (pid 4377091) was expressed <SEQ ID 39; cp7091>: 

1 MLRQLCFQVF FFCFASLVYA EELEVVVRSE HITLPIEVSC QTDTKDPKIQ 

51 KYLSSLTEIF CKDIALGDCL QPTAASKESS SPLAISLRLH VPQLSVVLLQ 

101 SSKTPQTLCS FTISQNLSVD RQKIHHAADT VHYALTGIPG ISAGKIVFAL 

151 SSLGKDQKLK QGELWTTDYD GKNLAPLTTE CSLSITPKWV GVGSNFPYLY 

201 VSYKYGVPKI FLGSLENTEG KKVLPDKGNQ LMPTFSPRKK LLAFVADTYG 

251 NPDLFIQPFS LTSGPMGRPR RLLNENFGTQ GNPSFNPEGS QLVFISNKDG 

301 RPRLYIMSLD PEPQAPRLLT KKYRNSSCPA WSPDGKKIAF CSVIKGVRQI 

351 CIYDLSSGED YQLTTSPTNK ESPSWAIDSR HLVFSAGNAE ESELYLISLV 

401 TKKTNKIAIG VGEKRFPSWG AFPQQPIKRT L* 

A predicted signal peptide is highlighted. 

The cp7091 nucleotide sequence <SEQ ID 40> is: 

1 ATGTTACGGC AACTATGCTT CCAAGTTTTT TTCTTTTGCT TCGCATCGCT 

51 AGTCTATGCT GAAGAATTAG AAGTTGTTGT CCGTTCCGAA CATATCACGC 

101 TCCCTATTGA GGTCTCTTGC CAGACCGATA CGAAAGATCC AAAAATACAG 

151 AAATACCTCA GCTCGCTAAC GGAGATATTT TGCAAGGACA TTGCCCTAGG 

201 AGATTGTCTA CAACCCACAG CGGCTTCTAA AGAATCGTCA TCTCCTTTAG 

251 CAATATCTTT ACGGTTGCAT GTACCTCAGC TATCTGTAGT GCTTTTACAG 

301 TCTTCAAAAA CTCCTCAAAC CTTATGTTCT TTTACTATTT CTCAAAATCT 

351 TTCTGTAGAT CGTCAAAAAA TCCATCACGC TGCTGATACA GTTCATTACG 

401 CCCTCACAGG GATTCCTGGA ATCAGTGCTG GGAAAATTGT TTTTGCTCTA 

451 AGTTCTTTAG GAAAAGATCA AAAGCTCAAG CAAGGAGAAT TATGGACTAC 

501 AGATTACGAT GGGAAAAACC TCGCCCCTTT AACCACAGAA TGTTCGCTCT 

551 CTATAACTCC AAAATGGGTG GGTGTGGGAT CAAATTTTCC CTATCTCTAT 

601 GTTTCGTATA AGTATGGTGT GCCTAAAATT TTTCTTGGTT CCCTAGAGAA 

651 CACTGAAGGT AAAAAAGTCC TTCCGTTAAA AGGCAACCAA CTCATGCCTA 

701 CGTTTTCTCC AAGAAAAAAG CTTTTAGCTT TCGTTGCTGA TACGTATGGA 

751 AATCCTGATT TATTTATTCA ACCGTTCTCA CTAACTTCAG GACCTATGGG 

801 TCGCCCACGT CGCCTCCTTA ATGAGAATTT CGGGACTCAA GGGAATCCCT 

851 CCTTCAACCC TGAAGGATCC CAGCTTGTCT TTATATCGAA CAAAGACGGC 

901 CGTCCGCGTC TTTATATTAT GTCCCTCGAT CCTGAACCCC AAGCACCTCG 

951 CTTGCTGACA AAAAAATACA GAAATAGCAG TTGCCCTGCA TGGTCTCCAG 

1001 ATGGTAAAAA AATAGCCTTC TGCTCTGTAA TTAAAGGGGT GCGACAAATT 

1051 TGTATTTACG ATCTCTCCTC TGGAGAGGAT TACCAACTCA CTACGTCTCC 

1101 CACAAATAAA GAGAGTCCTT CTTGGGCTAT AGACAGCCGT CATCTTGTCT 

1151 TTAGTGCGGG GAATGCTGAA GAATCAGAGT TATATTTAAT CAGTCTAGTC 

1201 ACCAAAAAAA CTAACAAAAT TGCTATAGGA GTAGGAGAAA AACGGTTCCC 

1251 CTCCTGGGGT GCTTTCCCTC AGCAACCGAT AAAGAGAACA CTATGA 

The PSORT algorithm predicts an inner membrane location (0.109). 

The protein was expressed in E.coli and purified as a GST-fusion product, as shown in Figure 20A. 
A his-tagged protein was also expressed. The recombinant proteins were used to immunise mice, 
whose sera were used in a Western blot (Figure 20B) and for FACS analysis. 

These experiments show that cp7091 is a useful immunogen. These properties are not evident from 
the sequence alone. 

Example 21 

The following C.pneumoniae protein (pid 4376260) was expressed <SEQ ID 41; cp6260>: 

1 MRFSLCGFPL VFSFTLLSVF DTSLSATTIS LTPEDSFHGD SQNAERSYNV 

51 QAGDVYSLTG DVSISNVDNS ALNKACFNVT SGSVTFAGNH HGLYFNNISS 

101 GTTKEGAVLC CQDPQATARF SGFSTLSFIQ SPGDIKEQGC LYSKNALMLL 

151 NNYVVRFEQN QSKTKGGAIS GANVTIVGNY DSVSFYQNAA TFGGAIHSSG 

201 PLQIAVNQAE IRFAQNTAKN GSGGALYSDG DIDIDQNAYV LFRENEALTT 

251 AIGKGGAVCC LPTSGSSTPV PIVTFSDNKQ LVFERNHSIM GGGAIYARKL 

301 SISSGGPTLF INNISYANSQ NLGGAIAIDT GGEISLSAEK GTITFQGNRT 

351 SLPFLNGIHL LQNAKFLKLQ ARNGYSIEFY DPITSEADGS TQLNINGDPK 

401 NKEYTGTILF SGEKSLANDP RDFKSTIPQN VNLSAGYLVT KEGAEVTVSK 

451 FTQSPGSHLV LDLGTKLIAS KEDIAITGLA IDIDSLSSSS TAAVIKANTA 

501 NKQISVTDSI ELISPTGNAY EDLRMRNSQT FPLLSLEPGA GGSVTVTAGD 

551 FLPVSPHYGF QGNWKLAWTG TCNKVGEFFW DKINYKPRPE KEGNLVPNIL 

601 WGNAVDVRSL MQVQETHASS LQTDRGLWID GIGNFFHVSA SEDNIRYRHN 

651 SGGYVLSVNN EITPKHYTSM AFSQLFSRDK DYAVSNNEYR MYLGSYLYQY 

701 TTSLGNIFRY ASRNPNVNVG ILSRRFLQNP LMIFHFLCAY GHATNDMKTD 

751 YANFPMVKNS WRNNCWAIEC GGSMPLLVFE NGRLFQGAIP FMKLQLVYAY 

801 QGDFKETTAD GRRFSNGSLT SISVPLGIRF EKLALSQDVL YDFSFSYIPD 

851 IFRKDPSCEA ALVISGDSWL VPAAHVSRHA FVGSGTGRYH FNDYTELLCR 

901 GSIECRPHAR NYNINCGSKF RF* 

A predicted signal peptide is highlighted. 

The cp6260 nucleotide sequence <SEQ ID 42> is: 

1 ATGCGATTTT CGCTCTGCGG ATTTCCTCTA GTTTTTTCTT TTACATTGCT 

51 CTCAGTCTTC GACACTTCTT TGAGTGCTAC TACGATTTCT TTAACCCCAG 

101 AAGATAGTTT TCATGGAGAT AGTCAGAATG CAGAACGTTC TTATAATGTT 

151 CAAGCTGGGG ATGTCTATAG CCTTACTGGT GATGTCTCAA TATCTAACGT 

201 CGATAACTCT GCATTAAATA AAGCCTGCTT CAATGTGACC TCAGGAAGTG 

251 TGACGTTCGC AGGAAATCAT CATGGGTTAT ATTTTAATAA TATTTCCTCA 

301 GGAACTACAA AGGAAGGGGC TGTACTTTGT TGCCAAGATC CTCAAGCAAC 

351 GGCACGTTTT TCTGGGTTCT CCACGCTCTC TTTTATTCAG AGCCCCGGAG 

401 ATATTAAAGA ACAGGGATGT CTCTATTCAA AAAATGCACT TATGCTCTTA 

451 AACAATTATG TAGTGCGTTT TGAACAAAAC CAAAGTAAGA CTAAAGGCGG 

501 AGCTATTAGT GGGGCGAATG TTACTATAGT AGGCAACTAC GATTCCGTCT 

551 CTTTCTATCA GAATGCAGCC ACTTTTGGAG GTGCTATCCA TTCTTCAGGT 

601 CCCCTACAGA TTGCAGTAAA TCAGGCAGAG ATAAGATTTG CACAAAATAC 

651 TGCCAAGAAT GGTTCTGGAG GGGCTTTGTA CTCCGATGGT GATATTGATA 

701 TTGATCAGAA TGCTTATGTT CTATTTCGAG AAAATGAGGC ATTGACTACT 

751 GCTATAGGTA AGGGAGGGGC TGTCTGTTGT CTTCCCACTT CAGGAAGTAG 

801 TACTCCAGTT CCTATTGTGA CTTTCTCTGA CAATAAACAG TTAGTCTTTG 

851 AAAGAAACCA TTCCATAATG GGTGGCGGAG CCATTTATGC TAGGAAACTT 

901 AGCATCTCTT CAGGAGGTCC TACTCTATTT ATCAATAATA TATCATATGC 

951 AAATTCGCAA AATTTAGGTG GAGCTATTGC CATTGATACT GGAGGGGAGA 

1001 TCAGTTTATC AGCAGAGAAA GGAACAATTA CATTCCAAGG AAACCGGACG 

1051 AGCTTACCGT TTTTGAATGG CATCCATCTT TTACAAAATG CTAAATTCCT 

1101 GAAATTACAG GCGAGAAATG GATACTCTAT AGAATTTTAT GATCCTATTA 

1151 CTTCTGAAGC AGATGGGTCT ACCCAATTGA ATATCAACGG AGATCCTAAA 

1201 AATAAAGAGT ACACAGGGAC CATACTCTTT TCTGGAGAAA AGAGTCTAGC 

1251 AAACGATCCT AGGGATTTTA AATCTACAAT CCCTCAGAAC GTCAACCTGT 

1301 CTGCAGGATA CTTAGTTATT AAAGAGGGGG CCGAAGTCAC AGTTTCAAAA 

1351 TTCACGCAGT CTCCAGGATC GCATTTAGTT TTAGATTTAG GAACCAAACT 

1401 GATAGCCTCT AAGGAAGACA TTGCCATCAC AGGCCTCGCG ATAGATATAG 

1451 ATAGCTTAAG CTCATCCTCA ACAGCAGCTG TTATTAAAGC AAACACCGCA 

1501 AATAAACAGA TATCCGTGAC GGACTCTATA GAACTTATCT CGCCTACTGG 

1551 CAATGCCTAT GAAGATCTCA GAATGAGAAA TTCACAGACG TTCCCTCTGC 

1601 TCTCTTTAGA GCCTGGAGCC GGGGGTAGTG TGACTGTAAC TGCTGGAGAT 

1651 TTCCTACCGG TAAGTCCCCA TTATGGTTTT CAAGGCAATT GGAAATTAGC 

1701 TTGGACAGGA ACTGGAAACA AAGTTGGAGA ATTCTTCTGG GATAAAATAA 

1751 ATTATAAGCC TAGACCTGAA AAAGAAGGAA ATTTAGTTCC TAATATCTTG 

1801 TGGGGGAATG CTGTAGATGT CAGATCCTTA ATGCAGGTTC AAGAGACCCA 

1851 TGCATCGAGC TTACAGACAG ATCGAGGGCT GTGGATCGAT GGAATTGGGA 

1901 ATTTCTTCCA TGTATCTGCC TCCGAAGACA ATATAAGGTA CCGTCATAAC 

1951 AGCGGTGGAT ATGTTCTATC TGTAAATAAT GAGATCACAC CTAAGCACTA 

2001 TACTTCGATG GCATTTTCCC AACTCTTTAG TAGAGACAAG GACTATGCGG 

2051 TTTCCAACAA CGAATACAGA ATGTATTTAG GATCGTATCT CTATCAATAT 

2101 ACAACCTCCC TAGGGAATAT TTTCCGTTAT GCTTCGCGTA ACCCTAATGT 

2151 AAACGTCGGG ATTCTCTCAA GAAGGTTTCT TCAAAATCCT CTTATGATTT 

2201 TTCATTTTTT GTGTGCTTAT GGTCATGCCA CCAATGATAT GAAAACAGAC 

2251 TACGCAAATT TCCCTATGGT GAAAAACAGC TGGAGAAACA ATTGTTGGGC 

2301 TATAGAGTGC GGAGGGAGCA TGCCTCTATT GGTATTTGAG AACGGAAGAC 

2351 TTTTCCAAGG TGCCATCCCA TTTATGAAAC TACAATTAGT TTATGCTTAT 

2401 CAGGGAGATT TCAAAGAGAC GACTGCAGAT GGCCGTAGAT TTAGTAATGG 

2451 GAGTTTAACA TCGATTTCTG TACCTCTAGG CATACGCTTT GAGAAGCTGG 

2501 CACTTTCTCA GGATGTACTC TATGACTTTA GTTTCTCCTA TATTCCTGAT 

2551 ATTTTCCGTA AGGATCCCTC ATGTGAAGCT GCTCTGGTGA TTAGCGGAGA 

2601 CTCCTGGCTT GTTCCGGCAG CACACGTATC AAGACATGCT TTTGTAGGGA 

2651 GTGGAACGGG TCGGTATCAC TTTAACGACT ATACTGAGCT CTTATGTCGA 

2701 GGAAGTATAG AATGCCGCCC CCATGCTAGG AATTATAATA TAAACTGTGG 

2751 AAGCAAATTT CGTTTTTAG 



The PSORT algorithm predicts an outer membrane location (0.921). 

The protein was expressed in E.coli and purified as both a his-tag and a GST-fusion product. The 
GST-fusion is shown in Figure 21A. This recombinant protein was used to immunise mice, whose sera 
were used in a Western blot (Figure 21B) and for FACS analysis (Figure 21C). 

This protein also showed good cross-reactivity with human sera, including sera from patients with 
pneumonitis. 

These experiments show that cp6260 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 



Example 22 

The following C.pneumoniae protein (pid 4376456) was expressed <SEQ ID 43; cp6456>: 

1 MSSPVNNTPS APNIPIPAPT TPGIPTTKPR SSFIEKVIIV AKYILFAIAA 

51 TSGALGTILG LSGALTPGIG IALLVIFFVS MVLLGLILKD SISGGEERRL 

101 REEVSRFTSE NQRLTVITTT LETEVKDLKA AKDQLTLEIE AFRNENGNLK

151 TTAEDLEEQV SKLSEQLEAL ERINQLIQAN AGDAQEISSE LKKLISGWDS

201 KVVEQINTSI QALKVLLGQE WVQEAQTHVK AMQEQIQALQ AEILGMHNQS

251 TALQKSVENL LVQDQALTRV VGELLESENK LSQACSALRQ EIEKLAQHET

301 SLQQRIDAML AQEQNLAEQV TALEKMKQEA QKAESEFIAC VRDRTFGRRE

351 TPPPTTPVVE GDESQEEDEG GTPPVSQPSS PVDRATGDGQ *

The cp6456 nucleotide sequence <SEQ ID 44> is: 

1 ATGTCATCTC CTGTAAATAA CACACCCTCA GCACCAAACA TTCCAATACC 

51 AGCGCCCACG ACTCCAGGTA TTCCTACAAC AAAACCTCGT TCTAGTTTCA 

101 TTGAAAAGGT TATCATTGTA GCTAAGTACA TACTATTTGC AATTGCAGCC 

151 ACATCAGGAG CACTCGGAAC AATTCTAGGT CTATCTGGAG CGCTAACCCC 

201 AGGAATAGGT ATTGCCCTTC TTGTTATCTT CTTTGTTTCT ATGGTGCTTT 

251 TAGGTTTAAT CCTTAAAGAT TCTATAAGTG GAGGAGAAGA ACGCAGGCTC 

301 AGAGAAGAGG TCTCTCGATT TACAAGTGAG AATCAACGGT TGACAGTCAT 

351 AACCACAACA CTTGAGACTG AAGTAAAGGA TTTAAAAGCA GCTAAAGATC 

401 AACTTACACT TGAAATCGAA GCATTTAGAA ATGAAAACGG TAATTTAAAA

451 ACAACTGCTG AGGACTTAGA AGAGCAGGTT TCTAAACTTA GCGAACAATT 

501 AGAAGCACTA GAGCGAATTA ATCAACTTAT CCAAGCAAAC GCTGGAGATG 

551 CTCAAGAAAT TTCGTCTGAA CTAAAGAAAT TAATAAGCGG TTGGGATTCC 

601 AAAGTTGTTG AACAGATAAA TACTTCTATT CAAGCATTGA AAGTGTTATT 

651 GGGTCAAGAG TGGGTGCAAG AGGCTCAAAC ACACGTTAAA GCAATGCAAG 

701 AGCAAATTCA AGCATTGCAA GCTGAAATTC TAGGAATGCA CAATCAATCT

751 ACAGCATTGC AAAAGTCAGT TGAGAATCTA TTAGTACAAG ATCAAGCTCT

801 AACAAGAGTA GTAGGTGAGT TGTTAGAGTC TGAGAACAAG CTAAGCCAAG 

851 CTTGTTCTGC GCTACGTCAA GAAATAGAAA AGTTGGCCCA ACATGAAACA 

901 TCTTTGCAAC AACGTATTGA TGCGATGCTA GCCCAAGAGC AAAATTTGGC 

951 AGAGCAGGTC ACAGCCCTTG AAAAAATGAA ACAAGAAGCT CAGAAGGCTG 

1001 AGTCCGAGTT CATTGCTTGT GTACGTGATC GAACTTTCGG ACGTCGTGAA 

1051 ACACCTCCAC CAACAACACC TGTAGTTGAA GGTGATGAAA GTCAAGAAGA 

1101 AGACGAAGGA GGTACTCCCC CAGTATCACA ACCATCTTCA CCCGTAGATA 

1151 GAGCAACAGG AGATGGTCAG TAA 

The PSORT algorithm predicts inner membrane (0.127). 

The protein was expressed in E.coli and purified as a GST-fusion product, as shown in Figure 22A. 
The recombinant protein was used to immunise mice, whose sera were used in a Western blot 
(Figure 22B) and for FACS analysis (Figure 22C). A his-tag protein was also expressed. 

These experiments show that cp6456 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 

Example 23 

The following C.pneumoniae protein (pid 4376729) was expressed <SEQ ID 45; cp6729>: 

1 MKIPLHKLLI SSTLVTPILL SIATYGADAS LSPTDSFDGA GGSTFTPKST

51 ADANGTNYVL SGNVYINDAG KGTALTGCCF TETTGDLTFT GKGYSFSFNT

101 VDAGSNAGAA ASTTADKALT FTGFSNLSFI AAPGTTVASG KSTLSSAGAL

151 NLTDNGTILF SQNVSNEANN NGGAITTKTL SISGNTSSIT FTSNSAKKLG

201 GAIYSSAAAS ISGNTGQLVF MNNKGETGGG ALGFEASSSI TQNSSLFFSG

251 NTATDAAGKG GAIYCEKTGE TPTLTISGNK SLTFAENSSV TQGGAICAHG

301 LDLSAAGPTL FSNNRCGNTA AGKGGAIAIA DSGSLSLSAN QGDITFLGNT

351 LTSTSAPTST RNAIYLGSSA KITNLRAAQG QSIYFYDPIA SNTTGASDVL

401 TINQPDSNSP LDYSGTIVFS GEKLSADEAK AADNFTSILK QPLALASGTL

451 ALKGNVELDV NGFTQTEGST LLMQPGTKLK ADTEAISLTK LVVDLSALEG

501 NKSVSIETAG ANKTITLTSP LVFQDSSGNF YESHTINQAF TQPLVVFTAA

551 TAASDIYIDA LLTSPVQTPE PHYGYQGHWE ATWADTSTAK SGTMTWVTTG

601 YNPNPERRAS VVPDSLWASF TDIRTLQQIM TSQANSIYQQ RGLWASGTAN

651 FFHKDKSGTN QAFRHKSYGY IVGGSAEDFS ENIFSVAFCQ LFGKDKDLFI

701 VENTSHNYLA SLYLQHRAFL GGLPMPSFGS ITDMLKDIPL ILNAQLSYSY

751 TKNDMDTRYT SYPEAQGSWT NNSGALELGG SLALYLPKEA PFFQGYFPFL

801 KFQAVYSRQQ NFKESGAEAR AFDDGDLVNC SIPVGIRLEK ISEDEKNNFE

851 ISLAYIGDVY RKNPRSRTSL MVSGASWTSL CKNLARQAFL ASAGSHLTLS

901 PHVELSGEAA YELRGSAHIY NVDCGLRYSF *

A predicted signal peptide is highlighted. 



The cp6729 nucleotide sequence <SEQ ID 46> is: 

1 ATGAAAATAC CCTTGCACAA ACTCCTGATC TCTTCGACTC TTGTCACTCC 

51 CATTCTATTG AGCATTGCAA CTTACGGAGC AGATGCTTCT TTATCCCCTA

101 CAGATAGCTT TGATGGAGCG GGCGGCTCTA CATTTACTCC AAAATCTACA

151 GCAGATGCCA ATGGAACGAA CTATGTCTTA TCAGGAAATG TCTATATAAA

201 CGATGCTGGG AAAGGCACAG CATTAACAGG CTGCTGCTTT ACAGAAACTA

251 CGGGTGATCT GACATTTACT GGAAAGGGAT ACTCATTTTC ATTCAACACG

301 GTAGATGCGG GTTCGAATGC AGGAGCTGCG GCAAGCACAA CTGCTGATAA

351 AGCCCTAACA TTCACAGGAT TTTCTAACCT TTCCTTCATT GCAGCTCCTG

401 GAACTACAGT TGCTTCAGGA AAAAGTACTT TAAGTTCTGC AGGAGCCTTA

451 AATCTTACCG ATAATGGAAC GATTCTCTTT AGCCAAAACG TCTCCAATGA

501 AGCTAATAAC AATGGCGGAG CGATCACCAC AAAAACTCTT TCTATTTCTG

551 GGAATACCTC TTCTATAACC TTCACTAGTA ATAGCGCAAA AAAATTAGGT

601 GGAGCGATCT ATAGCTCTGC GGCTGCAAGT ATTTCAGGAA ACACCGGCCA 

651 GTTAGTCTTT ATGAATAATA AAGGAGAAAC TGGGGGTGGG GCTCTGGGCT 

701 TTGAAGCCAG CTCCTCGATT ACTCAAAATA GCTCCCTTTT CTTCTCTGGA 

751 AACACTGCAA CAGATGCTGC AGGCAAGGGC GGGGCCATTT ATTGTGAAAA 

801 AACAGGAGAG ACTCCTACTC TTACTATCTC TGGAAATAAA AGTCTGACCT 

851 TCGCCGAGAA CTCTTCAGTA ACTCAAGGCG GAGCAATCTG TGCCCATGGT

901 CTAGATCTTT CCGCTGCTGG CCCTACCCTA TTTTCAAATA ATAGATGCGG

951 GAACACAGCT GCAGGCAAGG GCGGCGCTAT TGCAATTGCC GACTCTGGAT

1001 CTTTAAGTCT CTCTGCAAAT CAAGGAGACA TCACGTTCCT TGGCAACACT

1051 CTAACCTCAA CCTCCGCGCC AACATCGACA CGGAATGCTA TCTACCTGGG

1101 ATCGTCAGCA AAAATTACGA ACTTAAGGGC AGCCCAAGGC CAATCTATCT

1151 ATTTCTATGA TCCGATTGCA TCTAACACCA CAGGAGCTTC AGACGTTCTG

1201 ACCATCAACC AACCGGATAG CAACTCGCCT TTAGATTATT CAGGAACGAT

1251 TGTATTTTCT GGGGAAAAGC TCTCTGCAGA TGAAGCGAAA GCTGCTGATA

1301 ACTTCACATC TATATTAAAG CAACCATTGG CTCTAGCCTC TGGAACCTTA

1351 GCACTCAAAG GAAATGTCGA GTTAGATGTC AATGGTTTCA CACAGACTGA

1401 AGGCTCTACA CTCCTCATGC AACCAGGAAC AAAGCTCAAA GCAGATACTG

1451 AAGCTATCAG TCTTACCAAA CTTGTCGTTG ATCTTTCTGC CTTAGAGGGA

1501 AATAAGAGTG TGTCCATTGA AACAGCAGGA GCCAACAAAA CTATAACTCT

1551 AACCTCTCCT CTTGTTTTCC AAGATAGTAG CGGCAATTTT TATGAAAGCC

1601 ATACGATAAA CCAAGCCTTC ACGCAGCCTT TGGTGGTATT CACTGCTGCT

1651 ACTGCTGCTA GCGATATTTA TATCGATGCG CTTCTCACTT CTCCAGTACA

1701 AACTCCAGAA CCTCATTACG GGTATCAGGG ACATTGGGAA GCCACTTGGG

1751 CAGACACATC AACTGCAAAA TCAGGAACTA TGACTTGGGT AACTACGGGC

1801 TACAACCCTA ATCCTGAGCG TAGAGCTTCC GTAGTTCCCG ATTCATTATG

1851 GGCATCCTTT ACTGACATTC GCACTCTACA GCAGATCATG ACATCTCAAG

1901 CGAATAGTAT CTATCAGCAA CGAGGACTCT GGGCATCAGG AACTGCGAAT

1951 TTCTTCCATA AGGATAAATC AGGAACTAAC CAAGCATTCC GACATAAAAG

2001 CTACGGCTAT ATTGTTGGAG GAAGTGCTGA AGATTTTTCT GAAAATATCT

2051 TCAGTGTAGC TTTCTGCCAG CTCTTCGGTA AAGATAAAGA CCTGTTTATA

2101 GTTGAAAATA CCTCTCATAA CTATTTAGCG TCGCTATACC TGCAACATCG

2151 AGCATTCCTA GGAGGACTTC CCATGCCCTC ATTTGGAAGT ATCACCGACA

2201 TGCTGAAAGA TATTCCTCTC ATTTTGAATG CCCAGCTAAG CTACAGCTAC

2251 ACTAAAAATG ATATGGATAC TCGCTATACT TCCTATCCTG AAGCTCAAGG

2301 CTCTTGGACC AATAACTCTG GGGCTCTAGA GCTCGGAGGA TCTCTGGCTC

2351 TATATCTCCC TAAAGAAGCA CCGTTCTTCC AGGGATATTT CCCCTTCTTA

2401 AAGTTCCAGG CAGTCTACAG CCGCCAACAA AACTTTAAAG AGAGTGGCGC

2451 TGAAGCCCGT GCTTTTGATG ATGGAGACCT AGTGAACTGC TCTATCCCTG

2501 TCGGCATTCG GTTAGAAAAA ATCTCCGAAG ATGAAAAAAA TAATTTCGAG

2551 ATTTCTCTAG CCTACATTGG TGATGTGTAT CGTAAAAATC CCCGTTCGCG

2601 TACTTCTCTA ATGGTCAGTG GAGCCTCTTG GACTTCGCTA TGTAAAAACC

2651 TCGCACGACA AGCCTTCTTA GCAAGTGCTG GAAGCCATCT GACTCTCTCC

2701 CCTCATGTAG AACTCTCTGG GGAAGCTGCT TATGAGCTTC GTGGCTCAGC

2751 ACACATCTAC AATGTAGATT GTGGGCTAAG ATACTCATTC TAG



The PSORT algorithm predicts outer membrane (0.927). 



The protein was expressed in E.coli and purified as a GST-fusion product, as shown in Figure 23A.
The recombinant protein was used to immunise mice, whose sera were used in a Western blot 
(Figure 23B) and for FACS analysis (Figure 23C). A his-tag protein was also expressed. 

The cp6729 protein was also identified in the 2D-PAGE experiment (Cpn0446) and showed good 
cross-reactivity with human sera, including sera from patients with pneumonitis. 

These experiments show that cp6729 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 



Example 24 

The following C.pneumoniae protein (pid 4376849) was expressed <SEQ ID 47; cp6849>:

1 MSKLIRRVVT VLALTSMASC FASGGIEAAV AESLITKIVA SAETKPAPVP

51 MTAKKVRLVR RNKQPVEQKS RGAFCDKEFY PCEEGRCQPV EAQQESCYGR

101 LYSVKVNDDC NVEICQSVPE YATVGSPYPI EILAIGKKDC VDVVITQQLP

151 CEAEFVSSDP ETTPTSDGKL VWKIDRLGAG DKCKITVWVK PLKEGCCFTA

201 ATVCACPELR SYTKCGQPAI CIKQEGPDCA CLRCPVCYKI EVVNTGSAIA

251 RNVTVDNPVP DGYSHASGQR VLSFNLGDMR PGDKKVFTVE FCPQRRGQIT

301 NVATVTYCGG HKCSANVTTV VNEPCVQVNI SGADWSYVCK PVEYSISVSN

351 PGDLVLHDVV IQDTLPSGVT VLEAPGGEIC CNKVVWRIKE MCPGETLQFK

401 LVVKAQVPGR FTNQVAVTSE SNCGTCTSCA ETTTHWKGLA ATHMCVLDTN

451 DPICVGENTV YRICVTNRGS AEDTNVSLIL KFSKELQPIA SSGPTKGTIS

501 GNTVVFDALP KLGSKESVEF SVTLKGIAPG DARGEAILSS DTLTSPVSDT

551 ENTHVY*



A predicted signal peptide is highlighted. 

The cp6849 nucleotide sequence <SEQ ID 48> is: 



1 ATGTCCAAAC TCATCAGACG AGTAGTTACG GTCCTTGCGC TAACGAGTAT
51 GGCGAGTTGC TTTGCCAGCG GGGGTATAGA GGCCGCTGTA GCAGAGTCTC
101 TGATTACTAA GATCGTCGCT AGTGCGGAAA CAAAGCCAGC ACCTGTTCCT
151 ATGACAGCGA AGAAGGTTAG ACTTGTCCGT AGAAATAAAC AACCAGTTGA
201 ACAAAAAAGC CGTGGTGCTT TTTGTGATAA AGAATTTTAT CCCTGTGAAG
251 AGGGACGATG TCAACCTGTA GAGGCTCAGC AAGAGTCTTG CTACGGAAGA
301 TTGTATTCTG TAAAAGTAAA CGATGATTGC AACGTAGAAA TTTGCCAGTC
351 CGTTCCAGAA TACGCTACTG TAGGATCTCC TTACCCTATT GAAATCCTTG
401 CTATAGGCAA AAAAGATTGT GTTGATGTTG TGATTACACA ACAGCTACCT
451 TGCGAAGCTG AATTCGTAAG CAGTGATCCA GAAACAACTC CTACAAGTGA
501 TGGGAAATTA GTCTGGAAAA TCGATCGCCT GGGTGCAGGA GATAAATGCA
551 AAATTACTGT ATGGGTAAAA CCTCTTAAAG AAGGTTGCTG CTTCACAGCT
601 GCTACTGTAT GTGCTTGCCC AGAGCTCCGT TCTTATACTA AATGCGGTCA
651 ACCAGCCATT TGTATTAAGC AAGAAGGACC TGACTGTGCT TGCCTAAGAT
701 GCCCTGTATG CTACAAAATC GAAGTAGTGA ACACAGGATC TGCTATTGCC
751 CGTAACGTAA CTGTAGATAA TCCTGTTCCC GATGGCTATT CTCATGCATC
801 TGGTCAAAGA GTTCTCTCTT TTAACTTAGG AGACATGAGA CCTGGCGATA
851 AAAAGGTATT TACAGTTGAG TTCTGCCCTC AAAGAAGAGG TCAAATCACT
901 AACGTTGCTA CTGTAACTTA CTGCGGTGGA CACAAATGTT CTGCAAATGT
951 AACTACAGTT GTTAATGAGC CTTGTGTACA AGTAAATATC TCTGGTGCTG
1001 ATTGGTCTTA CGTATGTAAA CCTGTGGAGT ACTCTATCTC AGTATCGAAT
1051 CCTGGAGACT TGGTTCTTCA TGATGTCGTG ATCCAAGATA CACTCCCTTC
1101 TGGTGTTACA GTACTCGAAG CTCCTGGTGG AGAGATCTGC TGTAATAAAG
1151 TTGTTTGGCG TATTAAAGAA ATGTGCCCAG GAGAAACCCT CCAGTTTAAA
1201 CTTGTAGTGA AAGCTCAAGT TCCTGGAAGA TTCACAAATC AAGTTGCAGT
1251 AACTAGTGAG TCTAACTGCG GAACATGTAC ATCTTGCGCA GAAACAACAA
1301 CACATTGGAA AGGTCTTGCA GCTACCCATA TGTGCGTATT AGACACAAAT
1351 GATCCTATCT GTGTAGGAGA AAATACTGTC TATCGTATCT GTGTAACTAA
1401 CCGTGGTTCT GCTGAAGATA CTAACGTATC TTTAATCTTG AAGTTCTCAA
1451 AAGAACTTCA GCCAATAGCT TCTTCAGGTC CAACTAAAGG AACGATTTCA
1501 GGTAATACCG TTGTTTTCGA CGCTTTACCT AAACTCGGTT CTAAGGAATC
1551 TGTAGAGTTT TCTGTTACCT TGAAAGGTAT TGCTCCCGGA GATGCTCGCG
1601 GCGAAGCTAT TCTTTCTTCT GATACACTGA CTTCACCAGT ATCAGACACA
1651 GAAAATACCC ACGTGTATTA A



The PSORT algorithm predicts periplasmic space (0.93). 

The protein was expressed in E.coli and purified as a GST-fusion product, as shown in Figure 24A,
and also as a his-tag protein. The recombinant proteins were used to immunise mice, whose sera
were used in a Western blot (Figure 24B) and for FACS analysis (Figure 24C).

The cp6849 protein was also identified in the 2D-PAGE experiment (Cpn0557). 

These experiments show that cp6849 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 



Example 25 

The following C.pneumoniae protein (pid 4376273) was expressed <SEQ ID 49; cp6273>: 

1 MQLFHLTLFG LLLCSLPISL VAKFPESVGH KILYISTQST QQALATYLEA

51 LDAYGDHDFF VLRKIGEDYL KQSIHSSDPQ TRKSTIIGAG LAGSSEALDV

101 LSQAMETADP LQQLLVLSAV SGHLGKTSDD LLFKALASPY PVIRLEAAYR

151 LANLKNTKVT DHLHSFIHKL PEEIQCLSAA IFLRLETEES DAYIRDLLAA

201 KKSAIRSATA LQIGEYQQKR FLPTLRNLLT SASPQDQEAI LYALGKLKDG

251 QSYYNIKKQL QKPDVDVTLA AAQALIALGK EEDALPVIKK QALEERPRAL

301 YALRHLPSEI GIPIALPIFL KTKNSEAKLN VALALLELGC DTPKLLEYIT

351 ERLVQPHYNE TLALSFSKGR TLQNWKRVNI IVPQDPQERE RLLSTTRGLE

401 EQILTFLFRL PKEAYLPCIY KLLASQKTQL ATTAISFLSH TSHQEALDLL

451 FQAAKLPGEP IIRAYADLAI YNLTKDPEKK RSLHDYAKKL IQETLLFVDT

501 ENQRPHPSMP YLRYQVTPES RTKLMLDILE TLATSKSSED IRDLIQLMTE

551 GDAKNFPVLA GLLIKIVE*



A predicted signal peptide is highlighted. 

The cp6273 nucleotide sequence <SEQ ID 50> is: 



1 ATGGGACTAT TCCATCTAAC TCTCTTTGGA CTTTTATTGT GTAGTCTTCC
51 CATTTCTCTT GTTGCTAAAT TCCCTGAGTC TGTAGGTCAT AAGATCCTTT
101 ATATAAGTAC GCAATCTACA CAGCAGGCCT TAGCAACATA TCTGGAAGCT
151 CTAGATGCCT ACGGTGATCA TGACTTCTTC GTTTTAAGAA AAATCGGAGA
201 AGACTATCTC AAGCAAAGCA TCCACTCCTC AGATCCGCAA ACTAGAAAAA
251 GCACCATCAT TGGAGCAGGC CTGGCGGGAT CTTCAGAAGC CTTGGACGTG
301 CTCTCCCAAG CTATGGAAAC TGCAGACCCC CTGCAGCAGC TACTGGTTTT
351 ATCGGCAGTC TCAGGACATC TTGGGAAAAC TTCTGACGAC TTACTGTTTA
401 AAGCTTTAGC ATCTCCCTAT CCTGTCATCC GCTTAGAAGC CGCCTATAGA
451 CTTGCTAATT TGAAGAACAC TAAAGTCATT GATCATCTAC ATTCTTTCAT
501 TCATAAGCTT CCCGAAGAAA TCCAATGCCT ATCTGCGGCA ATATTCCTAC
551 GCTTGGAGAC TGAAGAATCT GATGCTTATA TTCGGGATCT CTTAGCTGCC
601 AAGAAAAGCG CGATTCGGAG TGCCACAGCT TTGCAGATCG GAGAATACCA
651 ACAAAAACGC TTTCTTCCGA CACTTAGGAA TTTGCTAACG AGTGCGTCTC
701 CTCAAGATCA AGAAGCTATT CTTTATGCTT TAGGGAAGCT TAAGGATGGT
751 CAGAGCTACT ACAATATAAA AAAGCAATTG CAGAAGCCTG ATGTGGATGT
801 CACTTTAGCA GCAGCTCAAG CTTTAATTGC TTTGGGGAAA GAAGAGGACG
851 CTCTTCCCGT GATAAAAAAG CAAGCACTTG AGGAGCGGCC TCGAGCCCTG
901 TATGCCTTAC GGCATCTACC CTCTGAGATA GGGATTCCGA TTGCCCTGCC
951 GATATTCCTA AAAACTAAGA ACAGCGAAGC CAAGTTGAAT GTAGCTTTAG
1001 CTCTCTTAGA GTTAGGGTGT GACACCCCTA AACTACTGGA ATACATTACC
1051 GAAAGGCTTG TCCAACCACA TTATAATGAG ACTCTAGCCT TGAGTTTCTC
1101 TAAGGGGCGT ACTTTACAAA ATTGGAAGCG GGTGAACATC ATAGTCCCTC
1151 AAGATCCCCA GGAGAGGGAA AGGTTGCTCT CCACAACCCG AGGTCTTGAA
1201 GAGCAGATCC TTACGTTTCT CTTCCGCCTA CCTAAAGAAG CTTACCTCCC
1251 CTGTATTTAT AAGCTTTTGG CGAGTCAGAA AACTCAGCTT GCCACTACTG
1301 CGATTTCTTT TTTAAGTCAC ACCTCACATC AGGAAGCCTT AGATCTACTT
1351 TTCCAAGCTG CGAAGCTTCC TGGAGAACCT ATCATCCGCG CCTATGCAGA
1401 TCTTGCTATT TATAATCTCA CCAAAGATCC TGAAAAAAAA CGTTCTCTCC
1451 ATGATTATGC AAAAAAGCTA ATTCAGGAAA CCTTGTTATT TGTGGACACG
1501 GAAAACCAAA GACCCCATCC CAGCATGCCC TATCTACGTT ATCAGGTCAC
1551 CCCAGAAAGC CGTACGAAGC TCATGTTGGA TATTCTAGAG ACACTAGCCA
1601 CCTCGAAGTC TTCCGAAGAT ATCCGTTTAT TGATACAACT GATGACGGAA
1651 GGAGATGCAA AAAATTTCCC AGTCCTTGCA GGCTTACTCA TAAAAATTGT
1701 GGAGTAA



The PSORT algorithm predicts a periplasmic location (0.922).

The protein was expressed in E.coli and purified as a his-tag product and as a GST-fusion product, as 
shown in Figure 25A. The recombinant GST-fusion was used to immunise mice, whose sera were 
used in a Western blot (Figure 25B) and for FACS analysis (Figure 25C). 

This protein also showed good cross-reactivity with human sera, including sera from patients with 
pneumonitis.

These experiments show that cp6273 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 

Example 26 

The following C.pneumoniae protein (pid 4376735) was expressed <SEQ ID 51; cp6735>:

1 MTILRMFLTC SALFLALPAA AQVLYLHESD GYNGAINNKS LEPKITCYPE

51 GTSYIFLDDV RISNVKHDQE DAGVFINRSG NLFFMGNRCN FTFHNLMTEG

101 FGAAISNRVG DTTLTLSNFS YLAFTSAPLL PQGQGAIYSL GSVMIENSEE

151 VTFCGNYSSW SGAAIYTPYL LGSKASRPSV NLSGNRYLVF RDNVSQGYGG

201 AISTHNLTLT TRGPSCFENN HAYHDVNSNG GAIAIAPGGS ISISVKSGDL

251 IFKGNTASQD GNTIHNSIHL QSGAQFKNLR AVSESGVYFY DPISHSESHK

301 ITDLVINAPE GKETYEGTIS FSGLCLDDHE VCAENLTSTI LQDVTLAGGT

351 LSLSDGVTLQ LHSFKQEASS TLTMSPGTTL LCSGDARVQN LHILIEDTDN

401 FVPVRIRAED KDALVSLEKL KVAFEAYWSV YDFPQFKEAF TIPLLELLGP

451 SFDSLLLGET TLERTQVTTE NDAVRGFWSL SWEEYPPSLD KDRRITPTKK

501 TVFLTWNPEI TSTP*

A predicted signal peptide is highlighted. 



The cp6735 nucleotide sequence <SEQ ID 52> is: 



1 ATGACCATAC TTCGAAATTT TCTTACCTGC TCGGCTTTAT TCCTCGCTCT
51 CCCTGCAGCA GCACAAGTTC TATATCTTCA TGAAAGTGAT GGTTATAACG
101 GTGCTATCAA TAATAAAAGC TTAGAACCTA AAATTACCTG TTATCCAGAA
151 GGAACTTCTT ACATCTTTCT AGATGACGTG AGGATTTCCA ACGTTAAGCA
201 TGATCAAGAA GATGCTGGGG TTTTTATAAA TCGATCTGGG AATCTTTTTT
251 TCATGGGCAA CCGTTGCAAC TTCACTTTTC ACAACCTTAT GACCGAGGGT
301 TTTGGCGCTG CCATTTCGAA CCGCGTTGGA GACACCACTC TCACTCTCTC
351 TAATTTTTCT TACTTAGCGT TCACCTCAGC ACCTCTACTA CCTCAAGGAC
401 AAGGAGCGAT TTATAGTCTT GGTTCCGTGA TGATCGAAAA TAGTGAGGAA
451 GTGACTTTCT GTGGGAACTA CTCTTCGTGG AGTGGAGCTG CGATTTATAC
501 TCCCTACCTT TTAGGTTCTA AGGCGAGTCG TCCTTCAGTA AATCTCAGCG
551 GGAACCGCTA CCTGGTGTTT AGAGACAATG TGAGCCAAGG TTATGGCGGC
601 GCCATATCTA CCCACAATCT CACACTCACG ACTCGAGGAC CTTCGTGTTT
651 TGAAAATAAT CATGCTTATC ATGACGTGAA TAGTAATGGA GGAGCCATTG
701 CCATTGCTCC TGGAGGATCG ATCTCTATAT CCGTGAAAAG CGGAGATCTC
751 ATCTTCAAAG GAAATACAGC ATCACAAGAC GGAAATACAA TACACAACTC
801 CATCCATCTG CAATCTGGAG CACAGTTTAA GAACCTACGT GCTGTTTCAG
851 AATCCGGAGT TTATTTCTAT GATCCTATAA GCCATAGCGA GTCGCATAAA
901 ATTACAGATC TTGTAATCAA TGCTCCTGAA GGAAAGGAAA CTTATGAAGG
951 AACAATTAGC TTCTCAGGAC TATGCCTGGA TGATCATGAA GTTTGTGCGG
1001 AAAATCTTAC TTCCACAATC CTACAAGATG TCACATTAGC AGGAGGAACT
1051 CTCTCTCTAT CGGATGGGGT TACCTTGCAA CTGCATTCTT TTAAGCAGGA
1101 AGCAAGCTCT ACGCTTACTA TGTCTCCAGG AACCACTCTG CTCTGCTCAG
1151 GAGATGCTCG GGTTCAGAAT CTGCACATCC TGATTGAAGA TACCGACAAC
1201 TTTGTTCCTG TAAGGATTCG CGCCGAGGAC AAGGATGCTC TTGTCTCATT
1251 AGAAAAACTT AAAGTTGCCT TTGAGGCTTA TTGGTCCGTC TATGACTTTC
1301 CTCAATTTAA GGAAGCCTTT ACGATTCCTC TTCTTGAACT TCTAGGGCCT
1351 TCTTTTGACA GTCTTCTCCT AGGGGAGACC ACTTTGGAGA GAACCCAAGT
1401 CACAACAGAG AATGACGCCG TTCGAGGTTT CTGGTCCCTA AGCTGGGAAG
1451 AGTACCCCCC TTCTCTGGAT AAAGACAGAA GGATCACACC AACTAAGAAA
1501 ACTGTTTTCC TCACTTGGAA TCCTGAGATC ACTTCTACGC CATAA



The PSORT algorithm predicts an outer membrane location (0.922). 



The protein was expressed in E.coli and purified as a his-tag product and as a GST-fusion
product, as shown in Figure 26A. The recombinant GST-fusion protein was used to immunise mice,
whose sera were used in a Western blot (Figure 26B).

These experiments show that cp6735 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 



Example 27 

The following C.pneumoniae protein (pid 4376784) was expressed <SEQ ID 53; cp6784>:

1 MNRRKARWVV ALFAMTALIS VGCCFWSQAK SRCSIDKYIP VVNRLLEVCG

51 LPEAENVEDL IESSSAWVLT PEERFSGELV SICQVKDEHA FYNDLSLLHM

101 TQAVPSYSAT YDCAVVFGGP LPALRQRLDF LVREWQRGVR FKKIVFLCGK

151 RGRYQSIEEQ EHFPDSRYNP FPTEENWESG NRVTPSSEEE IAKFVWMQML

201 LPRAWRDSTS GVRVTFLLAK PEENRVVANR KDTLLLFRSY QEAFPGRVLF

251 VSSQPFIGLD ACRVGQFFKG ESYDLAGPGF AQGVLKYHWA PRICLHTLAE

301 WLKETNGCLN ISEGCFG*

A predicted signal peptide is highlighted. 

The cp6784 nucleotide sequence <SEQ ID 54> is: 

1 ATGAATAGAA GAAAAGCAAG ATGGGTAGTG GCATTGTTCG CAATGACGGC 
51 GCTCATTTCT GTTGGGTGTT GTCCTTGGTC ACAAGCGAAA TCAAGATGTT 
101 CTATTGATAA GTATATTCCT GTAGTCAATC GTTTACTAGA AGTTTGTGGA 
151 CTTCCTGAAG CTGAGAATGT TGAGGATTTA ATCGAGTCCT CGTCTGCTTG 
201 GGTACTGACT CCTGAAGAAC GTTTTTCTGG AGAGTTAGTC TCTATCTGTC 
251 AGGTTAAAGA TGAGCATGCT TTCTATAACG ATTTGTCTTT ATTACATATG 
301 ACTCAGGCTG TGCCTTCGTA TTCTGCAACG TATGATTGTG CTGTAGTTTT 
351 TGGCGGGCCT TTGCCAGCGC TACGTCAGCG CTTAGATTTT TTGGTGCGAG 
401 AGTGGCAGCG TGGCGTGCGC TTTAAGAAAA TCGTTTTTCT ATGTGGAGAG 
451 CGAGGGCGCT ATCAGTCTAT TGAAGAACAA GAGCATTTCT TTGATTCTCG 
501 GTACAATCCT TTCCCTACTG AAGAGAACTG GGAATCTGGT AACCGAGTTA 
551 CTCCCTCTTC TGAAGAAGAG ATTGCCAAAT TTGTTTGGAT GCAAATGCTT 
601 TTACCTAGAG CATGGCGAGA TAGTACTTCA GGAGTCAGAG TGACATTTCT 
651 TCTAGCAAAG CCAGAGGAAA ATCGTGTGGT TGCGAATCGT AAGGACACCT 
701 TACTTTTATT CCGTTCTTAT CAAGAAGCGT TTCCGGGACG CGTGTTATTT 
751 GTAAGTAGTC AACCCTTTAT CGGTTTAGAT GCTTGCAGGG TCGGGCAGTT 
801 TTTCAAAGGG GAAAGCTATG ATCTTGCTGG ACCTGGATTT GCTCAAGGAG 
851 TCTTGAAGTA TCATTGGGCT CCAAGGATTT GTCTACATAC TTTAGCGGAA 
901 TGGTTAAAGG AAACGAACGG CTGCTTAAAT ATTTCAGAGG GTTGTTTTGG 
951 ATGA 
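
The numbered listing format used above (a position number followed by groups of ten residues) can be checked against its protein listing by conceptual translation. The following is a minimal sketch only, not part of the described experiments: it assumes the standard genetic code, and the helper names `parse_listing` and `translate` are hypothetical. It is applied here to the first row of SEQ ID 54.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order (assumption:
# the standard table is appropriate for this illustration).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def parse_listing(text):
    """Drop position numbers and whitespace from a numbered sequence listing."""
    return "".join(tok for tok in text.split() if not tok.isdigit()).upper()


def translate(dna):
    """Translate complete codons in frame 1, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(dna) - len(dna) % 3, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# First row of the cp6784 nucleotide listing (SEQ ID 54).
first_row = "1 ATGAATAGAA GAAAAGCAAG ATGGGTAGTG GCATTGTTCG CAATGACGGC"
print(translate(parse_listing(first_row)))
```

The translated residues can then be compared position by position with the first row of the corresponding protein listing (SEQ ID 53).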

The PSORT algorithm predicts a periplasmic location (0.894). 

The protein was expressed in E.coli and purified as a his-tag product and as a GST-fusion product, as 
shown in Figure 27A. The recombinant proteins were used to immunise mice, whose sera were used
in a Western blot (Figure 27B). The GST-fusion product was used for FACS analysis (Figure 27C). 

The cp6784 protein was also identified in the 2D-PAGE experiment (Cpn0498). 

These experiments show that cp6784 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone.

Example 28 

The following C.pneumoniae protein (pid 4376960) was expressed <SEQ ID 55; cp6960>:

1 MNRRWNLVLA TVALALSVAS CPVRSKPKDK DQGSLVEYKD NKDTNDIELS

51 DNQKLSRTFG HLLARQLRKS EDMFFDIAEV AKGLQAELVC KSAPLTETEY

101 EEKMAEVQKL VFEKKSKENL SLAEKPLKEN SKNAGVVEVQ PSKLQYKIIK

151 EGAGKAISGK PSALLHYKGS FINGQVFSSS EGNNEPILLP LGQTIPGFAL

201 GMQGMKEGET RVLYIHPDLA YGTAGQLPPN SLLIFEINLI QASADEVAAV 

251 PQEGNQGE* 



A predicted signal peptide is highlighted. 

The cp6960 nucleotide sequence <SEQ ID 56> is: 



1 ATGAACAGAC GGTGGAATTT AGTTTTAGCA ACAGTAGCTC TGGCACTCTC 

51 CGTCGCTTCT TGTGACGTAC GGTCTAAGGA TAAAGACAAG GATCAGGGGT 

101 CGTTAGTGGA ATATAAAGAT AACAAAGATA CCAATGACAT AGAATTATCC 

151 GATAATCAAA AGTTATCCAG AACATTTGGT CATTTATTAG CACGCCAATT 

201 ACGCAAGTCA GAAGATATGT TTTTTGATAT TGCAGAAGTG GCTAAGGGGT 

251 TGCAGGCGGA ATTGGTTTGT AAAAGTGCTC CTTTAACAGA AACAGAGTAT 

301 GAAGAAAAAA TGGCTGAAGT ACAGAAGTTG GTTTTTGAAA AAAAATCAAA 

351 AGAAAATCTT TCATTGGCAG AAAAATTCTT AAAAGAAAAT AGCAAGAACG 

401 CTGGTGTTGT TGAAGTGCAA CCAAGTAAAT TGCAATACAA AATTATTAAA 
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451 GAAGGTGCAG GGAAAGCAAT TTCAGGTAAA CCTTCAGCTC TATTGCACTA 

501 CAAGGGTTCC TTCATCAATG GCCAAGTATT TAG C AGTTC A GAAGGCAACA 

551 ATGAGCCTAT CTTGCTTCCT CTAGGCCAAA CAATTCCTGG TTTTGCTTTA 

601 GGTATGCAGG GCATGAAAGA AGGAGAAACT CGAGTTCTCT ACATCCATCC 

651 TGATCTTGCT TACGGAACCG CAGGACAACT TCCTCCAAAC TCTTTATTAA 

701 TTTTTGAAAT TAACTTGATT CAGGCTTCAG CAGATGAAGT TGCTGCTGTA

751 CCCCAAGAAG GAAATCAAGG TGAATGA

The PSORT algorithm predicts periplasmic space location (0.930). 

The protein was expressed in E.coli and purified as a his-tag product and as a GST-fusion product, as
shown in Figure 28A. The recombinant proteins were used to immunise mice, whose sera were used 
in a Western blot (Figure 28B) and for FACS analysis (Figure 28C). 

The cp6960 protein was also identified in the 2D-PAGE experiment. 

These experiments show that cp6960 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 

Example 29 

The following C.pneumoniae protein (pid 4376968) was expressed <SEQ ID 57; cp6968>: 

1 MKFLLYVPLL LVLVSTGCDA KPVSFEPFSG KLSTQRFEPQ HSAEEYFSQG

51 QEFLKKGNFR KALLCFGIIT HHFPRDILRN QAQYLIGVCY FTQDHPDLAD

101 KAFASYLQLP DAEYSEELFQ MKYAIAQRFA QGKRKRICRL EGFPKLMNAD

151 EDALRIYDEI LTAFPSKDLG AQALYSKAAL LIVKNDLTEA TKTLKKLTLQ

201 FPLHILSSEA FVRLSEIYLQ QAKKEPHNLQ YLHFAKLNEE AMKKQHPNHP

251 LNEVVSANVG AMREHYARGL YATGRFYEKK KKAEAANIYY RTAITNYPDT

301 LLVAKCQKRL DRISKHTS* 



A predicted signal peptide is highlighted. 
The cp6968 nucleotide sequence <SEQ ID 58> is: 



1 ATGAAATTTC TATTATACGT TCCACTTCTT CTTGTTCTCG TATCTACGGG 
51 GTGCGATGCA AAACCTGTTT CTTTTGAGCC CTTTTCAGGA AAGCTTTCCA 
101 CCCAGCGTTT TGAGCCTCAG CACTCTGCTG AAGAATATTT TTCTCAGGGA 
151 CAGGAATTCT TAAAAAAAGG AAATTTCAGA AAAGCTTTAC TATGCTTTGG 
201 AATCATTACG CATCACTTCC CTAGGGACAT CTTGCGTAAT CAAGCACAGT 
251 ATCTTATAGG AGTCTGTTAC TTCACGCAGG ATCACCCAGA TTTAGCAGAC 
301 AAGGCATTTG CATCTTACTT ACAACTTCCT GATGCGGAGT ACTCTGAAGA
351 GTTGTTCCAG ATGAAATATG CGATTGCTCA AAGATTTGCT CAAGGGAAGC 
401 GTAAACGGAT TTGTCGATTA GAGGGCTTCC CAAAACTAAT GAATGCTGAT 
451 GAAGATGCGC TACGCATTTA TGACGAGATT CTAACAGCGT TTCCTAGTAA 
501 AGACTTAGGA GCTCAGGCCC TCTATAGTAA AGCTGCGTTA CTTATTGTAA 
551 AAAACGATCT TACAGAAGCC ACCAAAACCT TAAAAAAACT CACGTTACAA 
601 TTTCCTCTAC ATATTTTATC TTCAGAGGCC TTTGTACGTT TATCGGAAAT 
651 CTATTTACAG CAAGCTAAGA AAGAGCCTCA CAATCTTCAA TATCTTCATT 
701 TTGCAAAGCT TAATGAAGAG GCAATGAAAA AGCAGCATCC TAACCATCCT 
751 CTGAATGAGG TTGTTTCTGC TAATGTTGGA GCTATGCGGG AACATTATGC 
801 TCGAGGTTTG TATGCCACAG GTCGTTTCTA TGAGAAGAAG AAAAAAGCCG 
851 AGGCTGCGAA TATCTATTAC CGCACTGCGA TTACAAACTA CCCAGACACT 
901 TTATTAGTGG CTAAATGTCA AAAGCGTCTA GATAGAATAT CTAAGCATAC 
951 TTCCTAA 
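
Because each listing row is labelled with the position of its first residue, the total length of a sequence can be read from its last row alone. As an illustrative sketch (the function name `listing_length` is hypothetical, and the check assumes a single open reading frame ending in one stop codon), the lengths of SEQ ID 57 and SEQ ID 58 can be compared as follows:

```python
def listing_length(last_row_start, last_row_residues):
    """Total residues in a numbered listing, given the start position and
    residues of its final row; a trailing '*' stop marker is not counted."""
    residues = "".join(last_row_residues.split()).rstrip("*")
    return last_row_start - 1 + len(residues)


# Final rows of the cp6968 listings (SEQ ID 58 nucleotide, SEQ ID 57 protein).
nt_len = listing_length(951, "TTCCTAA")
aa_len = listing_length(301, "LLVAKCQKRL DRISKHTS*")

# One codon per amino acid plus one stop codon.
consistent = nt_len == 3 * (aa_len + 1)
print(nt_len, aa_len, consistent)
```

A mismatch in this check would suggest a transcription error in one of the two listings rather than a property of the sequence itself.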

The PSORT algorithm predicts an inner membrane location (0.790). 

The protein was expressed in E.coli and purified as a his-tag product and as a GST-fusion product, as
shown in Figure 29A. The recombinant GST-fusion was used to immunise mice, whose sera were
used in a Western blot (Figure 29B) and for FACS analysis (Figure 29C).

This protein also showed good cross-reactivity with human sera, including sera from patients with
pneumonitis.

These experiments show that cp6968 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 

Example 30 

The following C.pneumoniae protein (pid 4376998) was expressed <SEQ ID 59; cp6998>: 

1 MKKLLKSALL SAAFAGSVGS LQALPVGNPS DPSLLIDGTI WEGAAGDPCD

51 PCATWCDAIS LRAGFYGDYV FDRILKVDAP KTFSMGAKPT GSAAANYTTA

101 VDRPNPAYNK HLHDAEWFTN AGFIALNIWD RFDVFCTLGA SNGYIRGNST

151 AFNLVGLFGV KGTTVNANEL PNVSLSNGVV ELYTDTSFSW SVGARGALWE

201 CGCATLGAEF QYAQSKPKVE ELNVICNVSQ FSVNKPKGYK GVAFPLPTDA

251 GVATATGTKS ATINYHEWQV GASLSYRLNS LVPYIGVQWS RATFDADNIR

301 IAQPKLPTAV LNLTAWNPSL LGNATALSTT DSFSDFMQIV SCQINKFKSR

351 KACGVTVGAT LVDADKWSLT AEARLINERA AHVSGQFRF* 

A predicted signal peptide is highlighted. 



The cp6998 nucleotide sequence <SEQ ID 60> is: 



1 ATGAAAAAAC TCTTAAAGTC GGCGTTATTA TCCGCCGCAT TTGCTGGTTC
51 TGTTGGCTCC TTACAAGCCT TGCCTGTAGG GAACCCTTCT GATCCAAGCT
101 TATTAATTGA TGGTACAATA TGGGAAGGTG CTGCAGGAGA TCCTTGCGAT
151 CCTTGCGCTA CTTGGTGCGA CGCTATTAGC TTACGTGCTG GATTTTACGG
201 AGACTATGTT TTCGACCGTA TCTTAAAAGT AGATGCACCT AAAACATTTT
251 CTATGGGAGC CAAGCCTACT GGATCCGCTG CTGCAAACTA TACTACTGCC
301 GTAGATAGAC CTAACCCGGC CTACAATAAG CATTTACACG ATGCAGAGTG
351 GTTCACTAAT GCAGGCTTCA TTGCCTTAAA CATTTGGGAT CGCTTTGATG
401 TTTTCTGTAC TTTAGGAGCT TCTAATGGTT ACATTAGAGG AAACTCTACA
451 GCGTTCAATC TCGTTGGTTT ATTCGGAGTT AAAGGTACTA CTGTAAATGC
501 AAATGAACTA CCAAACGTTT CTTTAAGTAA CGGAGTTGTT GAACTTTACA
551 CAGACACCTC TTTCTCTTGG AGCGTAGGCG CTCGTGGAGC CTTATGGGAA
601 TGCGGTTGTG CAACTTTGGG AGCTGAATTC CAATATGCAC AGTCCAAACC
651 TAAAGTTGAA GAACTTAATG TGATCTGTAA CGTATCGCAA TTCTCTGTAA
701 ACAAACCCAA GGGCTATAAA GGCGTTGCTT TCCCCTTGCC AACAGACGCT
751 GGCGTAGCAA CAGCTACTGG AACAAAGTCT GCGACCATCA ATTATCATGA
801 ATGGCAAGTA GGAGCCTCTC TATCTTACAG ACTAAACTCT TTAGTGCCAT
851 ACATTGGAGT ACAATGGTCT CGAGCAACTT TTGATGCTGA TAACATCCGC
901 ATTGCTCAGC CAAAACTACC TACAGCTGTT TTAAACTTAA CTGCATGGAA
951 CCCTTCTTTA CTAGGAAATG CCACAGCATT GTCTACTACT GATTCGTTCT
1001 CAGACTTCAT GCAAATTGTT TCCTGTCAGA TCAACAAGTT TAAATCTAGA
1051 AAAGCTTGTG GAGTTACTGT AGGAGCTACT TTAGTTGATG CTGATAAATG
1101 GTCACTTACT GCAGAAGCTC GTTTAATTAA CGAGAGAGCT GCTCACGTAT
1151 CTGGTCAGTT CAGATTCTAA



The PSORT algorithm predicts an outer membrane location (0.707). 

The protein was expressed in E.coli and purified as a GST-fusion (Figure 30A) and as a his-tag
product. The recombinant GST-fusion protein was used to immunise mice, whose sera were used in a 
Western blot (Figure 30B) and for FACS analysis (Figure 30C). 

The cp6998 protein was also identified in the 2D-PAGE experiment (Cpn0695) and showed good 
cross-reactivity with human sera, including sera from patients with pneumonitis. 

These experiments show that cp6998 is a surface-exposed and immunoaccessible protein, and that it
is a useful immunogen. These properties are not evident from the sequence alone.

Example 31



The following C.pneumoniae protein (pid 4377102) was expressed <SEQ ID 61; cp7102>: 



1 MKHTFTKRVL FFFFLVIPIP LLLNLMVVGF FSFSAAKANL VQVLHTRATN

51 LSIEFEKKLT IHKLFLDRLA NTLALKSYAS PSAEPYAQAY NEMMALSNTD

101 FSLCLIDPFD GSVRTKNPGD PFIRYLKQHP EMKKKLSAAV GKAFLLTIPG

151 KPLLHYLILV EDVASWDSTT TSGLLVSFYP MSFLQKDLFQ SLHITKGNIC

201 LVNKYGEVLF CAQDSESSFV FSLDLPNLPQ FQARSPSAIE IEKASGILGG

251 ENLITVSINK KRYLGLVLNK IPIQGTYTLS LVPVSDLIQS ALKVPLNICF

301 FYVLAFLLMW WIFSKINTKL NKPLQELTFC MEAAWRGNHN VRFEPQPYGY

351 EFNELGNIFN CTLLLLLNSI EKADIDYHSG EKLQKELGIL SSLQSALLSP

401 DFPTFPKVTF SSQHLRRRQL SGHFNGWTVQ DGGDTLLGII GLAGDIGLPS

451 YLYALSARSL FLAYASSDVS LQKISKDTAD SFSKTTEGNE AVVAMTFIKY

501 VEKDRSLELL SLSEGAPTMF LQRGESFVRL PLETHQALQP GDRLICLTGG

551 EDILKYFSQL PIEELLKDPL NPLNTENLID SLTMMLNNET EHSADGTLTI

601 LSFS*



A predicted signal peptide is highlighted. 

The cp7102 nucleotide sequence <SEQ ID 62> is: 



1 ATGAAACATA CCTTTACCAA GCGTGTTCTA TTTTTTTTCT TTTTAGTGAT
51 TCCCATTCCC CTACTCCTCA ATCTTATGGT CGTAGGTTTT TTCTCATTTT
101 CTGCCGCTAA AGCAAATTTA GTACAGGTCC TCCATACCCG TGCTACGAAC
151 TTAAGTATAG AATTCGAAAA AAAACTGACG ATACACAAGC TTTTCCTCGA
201 TAGACTTGCC AACACATTAG CCTTAAAATC CTATGCATCT CCTTCTGCAG
251 AGCCCTATGC ACAGGCATAC AATGAGATGA TGGCACTCTC CAATACAGAC
301 TTTTCCTTAT GCCTTATAGA TCCCTTTGAT GGATCTGTAA GGACGAAAAA
351 TCCTGGAGAC CCTTTCATTC GCTATCTAAA ACAGCATCCT GAAATGAAGA
401 AAAAGCTATC CGCAGCTGTA GGGAAAGCCT TTTTATTGAC CATTCCAGGT
451 AAACCACTTT TACATTATCT TATTCTAGTT GAAGATGTCG CATCTTGGGA
501 TTCTACAACG ACTTCAGGAC TGCTTGTAAG TTTCTATCCC ATGTCTTTTT
551 TACAGAAAGA TTTATTCCAA TCCTTACACA TCACCAAAGG AAATATCTGC
601 CTTGTAAATA AGTATGGCGA GGTCCTCTTC TGTGCTCAGG ACAGTGAATC
651 TTCTTTTGTA TTTTCTCTAG ATCTCCCTAA TTTACCGCAA TTCCAAGCAA
701 GAAGCCCCTC TGCCATAGAA ATTGAGAAAG CTTCTGGAAT TCTTGGTGGG
751 GAGAACCTAA TCACAGTGAG TATCAACAAG AAACGCTACC TAGGATTGGT
801 ACTGAATAAA ATTCCTATCC AAGGGACCTA CACTCTATCT TTAGTTCCAG
851 TTTCTGATCT CATCCAATCC GCCTTGAAAG TTCCTCTCAA TATTTGTTTT
901 TTCTATGTAC TTGCTTTCCT CCTCATGTGG TGGATTTTCT CTAAGATCAA
951 CACCAAACTT AACAAGCCTC TTCAAGAACT GACCTTCTGT ATGGAAGCTG
1001 CCTGGCGAGG AAACCATAAC GTGAGGTTTG AACCCCAGCC TTACGGTTAT
1051 GAATTCAATG AACTAGGAAA TATTTTCAAT TGCACTCTCC TACTCTTATT
1101 GAATTCCATT GAGAAAGCAG ATATCGATTA CCATTCAGGC GAAAAATTAC
1151 AAAAAGAATT AGGGATTTTA TCTTCACTAC AAAGTGCGTT ACTAAGTCCG
1201 GATTTCCCTA CGTTCCCTAA AGTTACCTTT AGTTCCCAAC ATCTCCGGAG
1251 AAGGCAACTT TCCGGTCATT TTAATGGTTG GACAGTTCAA GATGGTGGCG
1301 ATACCCTTTT AGGGATCATA GGGCTCGCTG GCGATATTGG TCTTCCTTCC
1351 TATCTCTATG CTTTATCCGC ACGGAGTCTT TTTCTTGCCT ATGCTTCCTC
1401 GGACGTTTCG TTACAAAAAA TCAGCAAGGA TACTGCCGAC AGCTTCTCAA
1451 AAACAACAGA AGGCAATGAG GCTGTAGTTG CTATGACTTT CATTAAATAT
1501 GTAGAAAAAG ATCGATCTCT AGAGCTCCTC TCGTTAAGCG AGGGAGCTCC
1551 TACCATGTTT CTACAACGAG GAGAATCTTT CGTACGTCTC CCCTTAGAGA
1601 CTCACCAAGC TCTACAGCCT GGAGATCGGT TGATCTGCCT CACTGGAGGA
1651 GAAGACATCC TCAAGTACTT TTCTCAGCTT CCTATTGAAG AGCTCTTAAA
1701 AGATCCTTTA AACCCTCTAA ATACAGAGAA TCTTATTGAT TCTCTAACCA
1751 TGATGTTAAA CAACGAAACC GAACATTCTG CAGATGGAAC TCTGACCATC
1801 CTTTCATTTT CATAA



The PSORT algorithm predicts an inner membrane location (0.338). 

The protein was expressed in E.coli and purified as a his-tag product and as a GST-fusion product.
The purified GST-fusion product is shown in Figure 31A. The recombinant GST-fusion protein was
used to immunise mice, whose sera were used in a Western blot and for FACS analysis (Figure 31B).

These experiments show that cp7102 is a surface-exposed and immunoaccessible protein, and that it
is a useful immunogen. These properties are not evident from the sequence alone.

Example 32 

The following C.pneumoniae protein (pid 4377106) was expressed <SEQ ID 63; cp7106>:

1 MKDLGTLGGT SSTAKTVSPD GKVIMGRSQI ADGSWHAFMC HTDFSSNNVL 

51 FDLDNTYKTL RENGRQLNSI FNLQNMMLQR ASDHEFTEFG RSNIALGAGL

101 YVNALQNLPS NLAAQYFGIA YKIRPKYRLG VFLDHNFSSH VPNNFNVSHN

151 RLWMGAFIGW QDSDALGSSV KVSFGYGKQK ATITREQLEN TEAGSGESHF

201 EGVAAQIEGR YGKSLGGHVR VQPFLGLQFV HITRKEYTEN AVQFPVHYDP

251 IDYSTGVVYL GIGSHIALVD SLHVGTRMGM EQNFAAHTDR FSGSIASIGN

301 FVFEKLDVTH TRAFAEMRVN YELPYLQSLN LILRVNQQPL QGVMGFSSDL

351 RYALGF* 

The cp7106 nucleotide sequence <SEQ ID 64> is: 

1 ATGAAAGATT TGGGGACTCT TGGGGGTACC TCTTCTACAG CAAAAACAGT 

51 GTCCCCAGAT GGTAAAGTGA TCATGGGTAG ATCACAAATT GCTGATGGCA 

101 GTTGGCACGC ATTTATGTGT CATACGGATT TCTCCTCTAA TAATGTACTC 

151 TTTGATCTCG ATAATACGTA TAAAACTCTA AGAGAAAATG GCCGTCAGCT 

201 AAATTCCATA TTCAACCTAC AAAATATGAT GTTACAGAGA GCCTCAGATC 

251 ATGAGTTCAC AGAGTTTGGA AGGAGTAACA TCGCTCTTGG TGCCGGGCTT 

301 TATGTGAATG CCTTGCAGAA TCTCCCTAGC AATTTAGCAG CACAATATTT 

351 TGGAATCGCA TACAAAATAC GTCCTAAATA TCGTTTGGGG GTGTTTTTGG

401 ACCATAATTT CAGCTCCCAC GTTCCTAATA ATTTTAACGT AAGCCACAAT 

451 AGACTCTGGA TGGGAGCCTT TATTGGATGG CAGGATTCTG ATGCTCTAGG

501 ATCTAGTGTC AAGGTGTCTT TCGGATATGG AAAACAAAAA GCCACGATTA 

551 CAAGAGAGCA ATTAGAGAAT ACAGAAGCCG GGAGTGGGGA GAGCCATTTT 

601 GAAGGGGTCG CTGCTCAGAT AGAAGGGCGG TATGGTAAGA GCCTCGGAGG 

651 ACATGTCAGG GTCCAGCCTT TCCTAGGACT GCAGTTTGTC CACATTACAA 

701 GGAAAGAATA TACCGAAAAT GCAGTGCAAT TTCCTGTACA CTATGATCCT 

751 ATAGACTATT CTACAGGTGT AGTGTATTTA GGAATTGGAT CTCATATTGC

801 ACTTGTAGAT TCTTTACATG TAGGCACACG CATGGGAATG GAGCAAAACT 

851 TTGCAGCCCA TACGGACAGG TTCTCAGGAT CTATAGCGTC TATTGGAAAC 

901 TTTGTGTTTG AAAAGCTTGA TGTGACTCAC ACAAGGGCAT TTGCGGAAAT 

951 GCGTGTCAAC TATGAGCTTC CCTATCTACA GTCTCTGAAT CTTATTCTAC 

1001 GAGTTAATCA ACAGCCTCTA CAAGGGGTTA TGGGATTTTC CAGTGATCTT 

1051 AGGTATGCCT TAGGATTCTA A 

The PSORT algorithm predicts a cytoplasmic location (0.224). 

The protein was expressed in E.coli and purified as a his-tag product and as a GST-fusion product. 
The purified GST-fusion product is shown in Figure 32A. The recombinant GST-fusion protein was 
used to immunise mice, whose sera were used in a Western blot (Figure 32B) and for FACS analysis 
(Figure 32C). 

This protein also showed very good cross-reactivity with human sera, including sera from patients 
with pneumonitis. 

These experiments show that cp7106 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 

Example 33 

The following C.pneumoniae protein (pid 4377228) was expressed <SEQ ID 65; cp7228>:

1 MTAVLILTSF PSEESARSLA RHLITERLAS CVHVFPKGTS TYLWEGKLCE 
51 SEEHHIQIKS IDIRFSEICL AIQEFSGYEV PEVLLFPIEN GDPRYLNWLT
101 ILSYPEKPPL SD*

The cp7228 nucleotide sequence <SEQ ID 66> is:

1 ATGACTGCTG TTCTTATTCT TACATCTTTC CCTTCGGAGG AAAGTGCTCG 

51 CTCCTTAGCT AGACATCTGA TTACAGAGCG TCTTGCTTCC TGTGTGCATG 

101 TATTCCCTAA AGGCACATCG ACATATCTAT GGGAAGGCAA GCTATGTGAG 

151 TCTGAAGAAC ATCATATACA AATCAAATCG ATAGACATAC GCTTCTCGGA 

201 AATTTGTCTT GCTATTCAGG AGTTCTCTGG CTATGAGGTT CCTGAAGTCT

251 TACTATTTCC TATTGAAAAT GGGGATCCGA GGTACTTGAA TTGGTTAACG

301 ATTCTCAGCT ATCCAGAGAA GCCTCCGCTT TCAGATTAG 

The PSORT algorithm predicts an inner membrane location (0.040). 

The protein was expressed in E.coli and purified as a his-tag product and as a GST-fusion product, as
shown in Figure 33A (his-tag = left-hand arrow, GST = right-hand arrow). The proteins were used to 
immunise mice, whose sera were used in a Western blot (Figure 33B) and FACS analysis. 

These experiments show that cp7228 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 
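
The relationship between SEQ ID 66 and SEQ ID 65 can be checked by translating the coding sequence. The following is a minimal sketch, assuming the standard genetic code and that the listing above is read with any extraction-introduced spaces removed; the names `CODON_TABLE`, `CP7228_LISTING`, `read_listing` and `translate` are illustrative only and are not part of the original disclosure.

```python
# Standard genetic code, bases ordered T, C, A, G (an assumption of this sketch).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}

# cp7228 coding sequence (SEQ ID 66) in the numbered 50-per-row layout.
CP7228_LISTING = """
1 ATGACTGCTG TTCTTATTCT TACATCTTTC CCTTCGGAGG AAAGTGCTCG
51 CTCCTTAGCT AGACATCTGA TTACAGAGCG TCTTGCTTCC TGTGTGCATG
101 TATTCCCTAA AGGCACATCG ACATATCTAT GGGAAGGCAA GCTATGTGAG
151 TCTGAAGAAC ATCATATACA AATCAAATCG ATAGACATAC GCTTCTCGGA
201 AATTTGTCTT GCTATTCAGG AGTTCTCTGG CTATGAGGTT CCTGAAGTCT
251 TACTATTTCC TATTGAAAAT GGGGATCCGA GGTACTTGAA TTGGTTAACG
301 ATTCTCAGCT ATCCAGAGAA GCCTCCGCTT TCAGATTAG
"""


def read_listing(text: str) -> str:
    """Drop the position numbers and join the 10-residue blocks."""
    return "".join(tok for tok in text.split() if tok.isalpha())


def translate(nt: str) -> str:
    """Translate an in-frame coding sequence; '*' marks a stop codon."""
    usable = len(nt) - len(nt) % 3
    return "".join(CODON_TABLE[nt[i:i + 3]] for i in range(0, usable, 3))


protein = translate(read_listing(CP7228_LISTING))
print(len(protein) - 1, "residues before the stop codon")
print(protein)
```

Comparing the printed translation with the protein listing gives a quick way to spot residues that were garbled during text extraction.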

Example 34 

The following C.pneumoniae protein (pid 4377170) was expressed <SEQ ID 67; cp7170>: 

1 MNSKMLKHLR LATLSFSMFF GIVSSPAVYA LGAGNPAAPV LPGVNPEQTG

51 WCAFQLCNSY DLFAALAGSL KFGFYGDYVF SESAHITNVP VITSVTTSGT 

101 GTTPTITSTT KNVDFDLNNS SISSSCVFAT IALQETSPAA IPLLDIAFTA 

151 RVGGLKQYYR LPLNAYRDFT SNPLNAESEV TDGLIEVQSD YGIVWGLSLQ 

201 KVLWKDGVSF VGVSADYRHG SSPINYIIVY NKANPEIYFD ATDGNLSYKE 

251 WSASIGISTY LNDYVLPYAS VSIGNTSRKA PSDSFTELEK QFTNFKFKIR 

301 KITNFDRVNF CFGTTCCISN NFYYSVEGRW GYQRAINITS GLQF* 

A predicted signal peptide is highlighted. 

The cp7170 nucleotide sequence <SEQ ID 68> is: 

1 ATGAATAGCA AGATGCTAAA ACATTTACGT TTAGCAACCC TTTCCTTCTC 

51 TATGTTCTTC GGGATTGTAT CTTCTCCCGC AGTATATGCC CTAGGGGCTG 

101 GAAACCCTGC AGCTCCAGTA CTCCCAGGTG TGAATCCTGA GCAAACGGGA

151 TGGTGTGCCT TCCAACTTTG TAATAGTTAC GATCTTTTTG CTGCTCTTGC 

201 AGGAAGCCTC AAATTTGGGT TCTATGGAGA TTATGTCTTC TCAGAAAGTG 

251 CCCATATTAC CAATGTCCCT GTCATTACCT CCGTTACGAC TTCAGGCACA 

301 GGAACAACGC CAACCATTAC CTCTACAACT AAAAACGTAG ACTTTGATCT 

351 TAACAACAGC TCCATCAGCT CGAGCTGTGT TTTTGCAACC ATAGCTCTAC 

401 AGGAAACATC CCCAGCTGCC ATTCCCCTTT TAGATATAGC CTTCACTGCA 

451 CGTGTCGGAG GACTTAAGCA GTACTACCGC CTCCCTCTCA ATGCTTACAG 

501 AGACTTCACT TCAAATCCTT TAAATGCAGA ATCTGAAGTT ACAGATGGTC 

551 TCATTGAAGT CCAGTCAGAC TATGGAATTG TCTGGGGTCT GAGTTTACAA 

601 AAAGTATTGT GGAAAGATGG AGTGTCTTTT GTAGGGGTGA GCGCTGACTA 

651 CCGTCACGGT TCCAGTCCCA TCAACTATAT CATCGTTTAC AACAAGGCCA 

701 ACCCCGAGAT CTATTTCGAT GCTACTGATG GAAACCTAAG CTATAAAGAA

751 TGGTCTGCAA GCATCGGCAT CTCTACGTAT CTTAATGACT ATGTGCTTCC 

801 CTATGCATCC GTATCTATAG GAAATACTTC AAGAAAAGCT CCTTCTGATA 

851 GCTTCACAGA ACTCGAAAAG CAATTTACGA ATTTTAAATT TAAAATTCGT 

901 AAAATCACAA ACTTCGACAG AGTAAACTTC TGCTTCGGAA CTACCTGCTG 

951 CATCTCAAAT AACTTCTACT ATAGTGTAGA AGGCCGTTGG GGATATCAGC 

1001 GTGCTATCAA CATTACGTCA GGTCTGCAGT TTTAG 

The PSORT algorithm predicts a bacterial outer membrane location (0.936). 

The protein was expressed in E.coli and purified as a his-tag product and as a GST-fusion product.
The purified GST-fusion product is shown in Figure 34A. The GST-fusion protein was used to
immunise mice, whose sera were used in a Western blot (Figure 34B) and for FACS analysis (Figure 34C).

The cp7170 protein was also identified in the 2D-PAGE experiment (Cpn0854).

These experiments show that cp7170 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 

Example 35 

The following C.pneumoniae protein (pid 4377072) was expressed <SEQ ID 69; cp7072>: 

1 MPIKKLFCLF LCSSLIAMSP IYGKTGDYEK LTLTGINIID RNGLSETICS

51 KEKLKKYTKV DFLAPQPYQK VMRMYKNKRG DNVSCLTAYH TNGQIKQYLE

101 CLNNRAYGRY REWHVNGNIK IQAEVIGGIA DLHPSAESGW LFDQTTFAYN

151 DEGILEAAIV YEKGLLEGSS VYYHTNGNIW KECPYHKGVP QGKFLTYTSS

201 GKKLKEQNYQ QGKRHGLSIR YSEDSEEDVL AWEEYHEGRL LKAEYLDPQT

251 HEIYATIHEG NGIQAIYGKY AVIETRAFYR GEPYGKVTRF DNSGTQIVQT

301 YNLLQGAKHG EEFFFYPETG KPKLLLNWHE GILNGIVKTW YPGGTLESCK

351 ELVKNKKSGL LTIYYPEGQI MATEEYDNDL LIKGEYFRPG DRHPYSKIDR

401 GCGTAVFFSS AGTITKKIPY QDGKPLLN* 

A predicted signal peptide is highlighted. 



The cp7072 nucleotide sequence <SEQ ID 70> is: 



1 ATGGATATAA AAAAACTCTT TTGCTTATTT CTATGTTCTT CTCTAATTGC
51 CATGAGTCCC ATTTATGGGA AAACAGGTGA CTATGAGAAA CTCACCCTTA
101 CAGGGATCAA TATCATTGAT AGAAACGGCC TGTCAGAAAC TATTTGCTCT
151 AAAGAGAAGC TAAAGAAATA CACCAAGGTA GACTTTCTTG CTCCCCAGCC
201 CTATCAAAAG GTCATGAGGA TGTATAAAAA CAAACGCGGA GATAACGTTT
251 CTTGTTTAAC AGCCTATCAC ACTAACGGGC AAATTAAGCA GTACCTGGAG
301 TGTCTCAATA ATCGTGCTTA TGGAAGATAT CGTGAATGGC ACGTCAACGG
351 GAATATCAAA ATCCAAGCTG AGGTTATCGG AGGTATTGCG GATCTTCATC
401 CCTCAGCAGA GTCTGGCTGG CTATTTGATC AAACTACATT TGCCTATAAT
451 GATGAAGGTA TCTTAGAAGC CGCTATCGTC TATGAAAAAG GGCTGCTCGA
501 AGGATCTTCG GTGTATTACC ATACTAATGG GAATATTTGG AAAGAGTGTC
551 CCTATCATAA GGGAGTTCCT CAAGGTAAAT TCCTGACATA CACATCTTCG
601 GGGAAACTGC TCAAAGAACA GAATTACCAA CAAGGCAAAA GACACGGTCT
651 TTCGATTCGC TACAGCGAAG ATTCCGAAGA AGATGTTTTA GCCTGGGAAG
701 AATATCATGA GGGACGACTC CTAAAAGCAG AGTACTTAGA TCCTCAAACT
751 CACGAAATCT ATGCGACTAT ACACGAAGGG AACGGCATTC AAGCAATCTA
801 CGGCAAGTAT GCCGTTATAG AAACTAGGGC ATTTTACCGA GGGGAACCTT
851 ATGGAAAAGT TACCAGATTC GACAACTCCG GAACACAGAT TGTCCAAACG
901 TATAACCTTT TGCAAGGCGC GAAGCACGGA GAAGAATTTT TCTTTTATCC
951 TGAGACAGGG AAACCCAAGC TGCTTCTTAA TTGGCATGAA GGAATTTTAA
1001 ATGGGATAGT AAAAACTTGG TATCCCGGAG GAACCTTAGA AAGTTGTAAA
1051 GAACTCGTAA ATAACAAAAA ATCCGGGTTA CTGACCATTT ACTACCCTGA
1101 AGGACAGATC ATGGCGACCG AAGAGTATGA TAATGATCTT CTAATTAAAG
1151 GAGAGTACTT CCGCCCTGGA GACCGTCATC CCTACTCTAA AATAGATCGT
1201 GGTTGTGGGA CTGCAGTATT TTTCTCGTCG GCGGGAACTA TTACTAAAAA
1251 AATCCCCTAT CAGGACGGCA AACCTTTGCT CAACTAG



The PSORT algorithm predicts a periplasmic location (0.688). 

The protein was expressed in E.coli and purified as a his-tag product (Figure 35A) and as a GST-fusion
product (Figure 35B). The recombinant his-tag protein was used to immunise mice,
whose sera were used in a Western blot (Figure 35C) and for FACS analysis.

These experiments show that cp7072 is a useful immunogen. These properties are not evident from 
the sequence alone. 



Example 36 



The following C.pneumoniae protein (pid 4376879) was expressed <SEQ ID 71; cp6879>: 

1 MATPAQKSPT FQDPSFVREL GSNHPVFSPL TLEERGEMAI ARVQQCGWNH
51 TIVKVSLIIL ALLTILGGGL LVGLLPAVPM FIGTGLIALG AVIFALALIL
101 CLYDSQGLPE ELPPVPEPQQ IQIEDLRNET REVLEGTLLE VLLKDRDAKD
151 PAVPQVVVDC EKRLGMLDRK LRREEEILYR STAHLKDEER YEFLLELLEM
201 RSLVADRLEF NRRSYERFVQ GIMTVRSEEG EKEISRLQDL ISLQQQTVQD
251 LRSRIDDEQK RCWTALQRIN QSQKDIQRAH DREASQRACE GTEMDCAERQ
301 QLEKDLRRQL KSMQEWIEMR GTIHQQEKAW RKQNAKLERL QEDLRLTGIA
351 FDEQSLFYRE YKEKYLSQKL DMQKILQEVN AEKSEKACLE SLVHDYEKQL
401 EQKDANLKKA AAVWEEELGK QQQEDYEQTQ EIRRLSTFIL EYQDSLREAE
451 KVEKDFQELQ QRYSRLQEEK QVKEKILEES MNHFADLFEK AQKENMAYKK
501 KLADLEGAAA PTEIGEDDDW VLTDSASLSQ KKIRELVEEN QELLKALAFK
551 SNELTQLVAD AVEAEKEISK LREHIEEQKE GLRALDKMHA QAIKDCEAAQ
601 RKCCDLESLL SPVREDAGMR FELEVELQRL QEENAQLRAE VERLEQEQFQ
651 G*



The cp6879 nucleotide sequence <SEQ ID 72> is: 



1 ATGGCAACAC CCGCTCAAAA ATCCCCTACA TTTCAAGATC CTAGTTTTGT
51 AAGAGAGCTA GGCAGTAACC ACCCTGTCTT TTCCCCGCTA ACGCTTGAGG
101 AAAGAGGGGA GATGGCAATA GCTCGAGTCC AGCAGTGTGG ATGGAATCAT
151 ACAATTGTTA AGGTAAGTCT TATTATTCTT GCTCTTCTTA CTATTTTAGG
201 GGGAGGATTA CTCGTAGGAT TGCTGCCAGC AGTTCCTATG TTTATTGGAA
251 CAGGTCTGAT TGCTTTGGGA GCCGTTATAT TTGCTTTGGC TTTGATTTTA
301 TGTCTTTATG ATTCTCAGGG CCTTCCTGAG GAACTCCCTC CGGTTCCTGA
351 ACCACAACAA ATTCAGATTG AAGATTTAAG AAACGAGACC AGAGAAGTTC
401 TTGAAGGGAC TCTTTTAGAG GTTCTCTTAA AGGATAGAGA CGCTAAGGAC
451 CCTGCGGTGC CCCAGGTGGT TGTAGACTGT GAAAAGCGTC TTGGAATGTT
501 GGATCGTAAG CTGCGACGTG AAGAGGAGAT TCTGTATCGC TCGACGGCCC
551 ATCTTAAAGA CGAGGAAAGG TATGAGTTCT TGCTGGAGCT CTTGGAAATG
601 CGTAGTCTGG TTGCCGATCG GCTAGAATTT AACCGTAGAA GTTATGAGCG
651 ATTTGTTCAA GGAATTATGA CAGTTAGATC AGAGGAGGGG GAAAAAGAGA
701 TTTCTCGTCT ACAAGATCTA ATCAGTTTGC AGCAGCAGAC GGTGCAAGAT
751 TTAAGGAGTC GGATCGATGA CGAGCAGAAG AGATGCTGGA CGGCTTTACA
801 ACGTATTAAC CAATCTCAGA AGGATATACA ACGGGCTCAT GATCGCGAGG
851 CTTCGCAGCG TGCCTGTGAG GGCACAGAGA TGGATTGTGC AGAACGCCAG
901 CAACTGGAGA AGGATTTAAG GAGACAGCTG AAATCTATGC AGGAGTGGAT
951 TGAGATGAGG GGCACAATCC ATCAACAAGA GAAGGCTTGG CGTAAGCAGA
1001 ATGCCAAATT AGAAAGATTA CAAGAGGATC TGAGACTTAC TGGGATTGCT
1051 TTTGACGAAC AATCTCTGTT CTATCGCGAA TATAAAGAGA AATATCTGAG
1101 TCAGAAACTA GATATGCAAA AGATTTTACA GGAAGTCAAC GCAGAGAAAA
1151 GTGAGAAGGC TTGCTTAGAG AGTCTGGTCC ATGACTATGA GAAGCAGCTC
1201 GAACAAAAAG ATGCTAATCT GAAGAAAGCA GCAGCTGTTT GGGAAGAAGA
1251 ATTAGGGAAG CAGCAACAGG AAGACTACGA ACAAACCCAA GAAATTAGAC
1301 GTCTGAGTAC ATTCATTCTT GAGTACCAGG ACAGTCTGCG TGAGGCAGAA
1351 AAAGTTGAGA AAGATTTCCA AGAGCTACAA CAAAGGTATA GCCGTCTTCA
1401 AGAGGAGAAA CAGGTAAAAG AAAAAATCTT AGAAGAAAGT ATGAATCATT
1451 TTGCCGATCT CTTTGAGAAG GCTCAAAAGG AAAACATGGC CTACAAGAAG
1501 AAGTTAGCGG ATTTAGAGGG TGCCGCTGCT CCTACTGAGA TCGGTGAGGA
1551 CGATGACTGG GTACTCACAG ATTCTGCTTC TCTCAGCCAG AAGAAGATCC
1601 GCGAACTCGT GGAAGAGAAT CAAGAACTCC TGAAAGCACT TGCATTTAAA
1651 TCTAACGAAT TGACTCAACT GGTTGCCGAT GCTGTAGAAG CTGAAAAAGA
1701 AATCAGCAAG CTTCGAGAAC ACATAGAAGA GCAGAAAGAA GGATTACGAG
1751 CTCTTGATAA GATGCATGCA CAAGCGATCA AAGATTGCGA AGCTGCTCAG
1801 AGAAAATGCT GTGACCTTGA GAGCCTTCTC TCTCCTGTTC GAGAAGATGC
1851 TGGAATGAGA TTTGAGCTAG AGGTCGAGCT TCAAAGATTG CAAGAAGAAA
1901 ATGCACAGCT TAGAGCGGAG GTTGAAAGAC TAGAGCAAGA GCAATTTCAA
1951 GGATAA



The PSORT algorithm predicts an inner membrane location (0.646). 

The protein was expressed in E.coli and purified as a his-tag product and as a GST-fusion product.
The purified GST-fusion product is shown in Figure 36A. The recombinant GST-fusion protein was 
used to immunise mice, whose sera were used in a Western blot (Figure 36B) and for FACS analysis. 

These experiments show that cp6879 is a useful immunogen. These properties are not evident from
the sequence alone.

Example 37

The following C.pneumoniae protein (pid 4376767) was expressed <SEQ ID 73; cp6767>:

1 MIKQIGRFFR AFIFIMPLSL TSCESKIDRN RIWIVGTNAT YPPFEYVDAQ 

51 GEVVGFDIDL AKAISEKLGK QLEVREFAFD ALILNLKKHR IDAILAGMSI

101 TPSRQKEIAL LPYYGDEVQE LMVVSKRSLE TPVLPLTQYS SVAVQTGTFQ

151 EHYLLSQPGI CVRSFDSTLE VIMEVRYGKS PVAVLEPSVG RVVLKDFPNL

201 VATRLELPPE CWVLGCGLGV AKDRPEEIQT IQQAITDLKS EGVIQSLTKK

251 WQDSEVAYE* 

The cp6767 nucleotide sequence <SEQ ID 74> is: 

1 ATGATAAAAC AAATAGGCCG TTTTTTTAGA GCATTTATTT TTATAATGCC 

51 TTTATCTTTA ACAAGTTGTG AGTCTAAAAT CGATCGAAAT CGCATCTGGA

101 TTGTAGGTAC GAATGCTACA TATCCTCCTT TTGAGTATGT GGATGCTCAG 

151 GGGGAAGTTG TAGGTTTCGA TATAGATTTG GCAAAGGCAA TTAGTGAAAA 

201 ACTTGGCAAG CAATTGGAAG TTAGAGAATT CGCTTTCGAT GCTTTAATTT 

251 TAAATTTAAA AAAACATCGT ATCGATGCAA TTTTAGCAGG AATGTCCATT 

301 ACTCCTTCGC GTCAGAAGGA AATCGCCCTG CTTCCCTATT ATGGCGATGA 

351 GGTTCAAGAG CTGATGGTGG TTTCTAAGCG GTCTTTAGAG ACCCCTGTGC 

401 TTCCCCTAAC ACAGTATTCT TCTGTTGCTG TTCAGACAGG AACGTTTCAG

451 GAGCATTATC TTTTATCTCA GCCCGGAATT TGTGTCCGTT CTTTTGATAG 

501 CACCTTGGAG GTGATTATGG AAGTTCGTTA TGGGAAATCT CCGGTTGCCG 

551 TTCTAGAACC CTCGGTAGGA CGTGTCGTTC TTAAAGACTT CCCTAATCTT 

601 GTTGCAACAA GATTAGAGCT CCCTCCTGAA TGTTGGGTGT TGGGCTGTGG 

651 TCTCGGCGTA GCTAAAGATC GTCCTGAAGA AATACAAACG ATTCAACAAG 

701 CGATTACAGA TTTAAAGAGC GAAGGGGTGA TTCAATCTTT AACCAAGAAA

751 TGGCAACTTT CTGAAGTTGC TTACGAATAG 

The PSORT algorithm predicts an inner membrane location (0.083). 

The protein was expressed in E.coli and purified as a his-tag product and as a GST-fusion product. 
The purified his-tag product is shown in Figure 37A. The recombinant his-tag protein was used to 
immunise mice, whose sera were used in a Western blot (Figure 37B) and for FACS analysis (Figure 
37C). The GST-fusion was also used in a Western blot (Figure 37D). 

The cp6767 protein was also identified in the 2D-PAGE experiment and showed good 
cross-reactivity with human sera, including sera from patients with pneumonitis. 

These experiments show that cp6767 is a useful immunogen. These properties are not evident from 
the sequence alone. 

Example 38 

The following C.pneumoniae protein (pid 4376717) was expressed <SEQ ID 75; cp6717>: 

1 MMSRLRFRLA ALGIFFILLV PNSVSAKTIV ASDKEKVGVL VYDNSVEAFQ

51 QILDCIDHAN FYVELCPCMT GGRTLKEMVD HLEARMDLVP ELCSYIIIQP

101 TFTDAEDQKL LKALKERHPN RFFYVFTGCP PSTSILAPNV IEMHIKLSII 

151 DGKYCILGGT NFEEFMCTPG DEVPEKVDNP RLFVSGVRRP LAFRDQDIML 

201 RSTAFGLQLR EEYHKQFAMW DYYAHHMWFI DNPEQFAGAC PPLTLEQAEE 

251 TVFPGFDKHE DLVLVDSSKI RIVLGGPHDK QPNPVTQEYL KLIQGARSSV 

301 KLAHMYFIPK DELLNALVDV SHNHGVHLSL ITNGCHELSP AITGPYAWGN 

351 RINYFALLYG KRYPLWKKWF CEKLKPYERV SIYEFAIWET QLHKKCMIID

401 DEIFVIGSYN FGKKSDAFDY ESIVVIESPE VAAKANKVFN KDIGLSIPVS

451 HGDIFSWYFH SVHHTLGHLQ LTYMPA* 

A predicted signal peptide is highlighted. 



The cp6717 nucleotide sequence <SEQ ID 76> is: 

1 ATGATGAGTC GGTTGCGTTT TCGCTTGGCA GCTCTTGGAA TATTTTTTAT
51 TTTGCTGGTT CCTAATTCTG TTTCAGCAAA GACAATCGTA GCTTCAGACA
101 AGGAGAAGGT TGGAGTTCTT GTTTATGACA ATAGTGTAGA GGCCTTTCAA
151 CAGATATTGG ATTGCATAGA TCATGCAAAT TTTTATGTAG AACTGTGTCC
201 CTGCATGACA GGAGGCCGAA CGCTTAAAGA GATGGTAGAT CACCTCGAGG
251 CTCGTATGGA TCTGGTTCCA GAGCTCTGTA GCTATATCAT TATCCAACCC
301 ACGTTTACCG ATGCTGAAGA CCAAAAATTA CTCAAAGCTC TCAAAGAACG
351 TCATCCCAAC CGGTTTTTCT ACGTTTTTAC AGGGTGCCCA CCCTCAACAA
401 GCATCCTCGC TCCTAATGTC ATTGAAATGC ATATCAAACT TTCTATCATC
451 GATGGGAAAT ATTGTATTTT AGGTGGTACC AATTTTGAAG AGTTTATGTG
501 CACTCCAGGG GATGAGGTTC CTGAGAAAGT GGATAACCCA CGTTTATTTG
551 TCAGTGGAGT GCGTCGGCCC CTAGCATTTC GTGATCAGGA TATCATGTTG
601 CGTTCTACAG CATTCGGTTT GCAGCTCAGA GAAGAATATC ATAAGCAATT
651 TGCTATGTGG GACTACTATG CACATCATAT GTGGTTCATT GATAATCCTG
701 AACAGTTTGC AGGCGCCTGT CCTCCACTGA CTTTAGAACA AGCCGAGGAG
751 ACAGTATTTC CTGGATTTGA CAAACATGAA GATCTTGTTC TTGTCGACTC
801 TTCCAAGATC AGGATAGTTT TAGGTGGTCC CCACGATAAG CAACCCAATC
851 CTGTGACTCA AGAATATTTG AAACTTATCC AGGGAGCTAG ATCTTCTGTG
901 AAGCTTGCTC ACATGTATTT CATCCCTAAG GACGAGCTTT TAAATGCTCT
951 TGTCGACGTT TCTCATAATC ACGGTGTTCA TCTGAGTTTA ATTACGAACG
1001 GCTGTCATGA ATTAAGTCCT GCAATTACAG GACCCTATGC TTGGGGAAAC
1051 CGTATTAACT ATTTCGCCTT GCTCTATGGG AAACGGTATC CTCTTTGGAA
1101 AAAATGGTTT TGCGAAAAGC TAAAACCTTA TGAGCGGGTT TCTATTTATG
1151 AGTTTGCTAT TTGGGAAACG CAGTTGCACA AGAAGTGTAT GATTATCGAT
1201 GATGAAATTT TTGTGATCGG AAGTTATAAT TTTGGAAAGA AAAGTGATGC
1251 CTTTGATTAC GAAAGTATTG TAGTTATCGA ATCTCCAGAA GTCGCTGCAA
1301 AAGCTAACAA AGTCTTCAAT AAAGATATCG GATTGTCGAT TCCTGTAAGT
1351 CATGGCGACA TTTTCTCTTG GTATTTCCAT TCCGTACACC ACACTTTGGG
1401 ACATTTGCAG CTGACCTATA TGCCAGCCTA G



The PSORT algorithm predicts a periplasmic location (0.939). 

The protein was expressed in E.coli and purified as a GST-fusion (Figure 38A), as a his-tagged
protein, and as a GST/his fusion product. The proteins were used to immunise mice, whose sera were 
used in a Western blot (Figure 38B) and for FACS analysis. 

These experiments show that cp6717 is a useful immunogen. These properties are not evident from 
the sequence alone. 



Example 39 

The following C.pneumoniae protein (pid 4376577) was expressed <SEQ ID 77; cp6577>:

1 MKKLLFSTFL LVLGSTSAAH ANLGYVNLKR CLEESDLGKK ETEELEAMKQ

51 QFVKNAEKIE EELTSIYNKL QDEDYMESLS DSASEELRKK FEDLSGEYNA

101 YQSQYYQSIN QSNVKRIQKL IQEVKIAAES VRSKEKLEAI LNEEAVLAIA

151 PGTDKTTEII AILNESFKKQ N*

A predicted signal peptide is highlighted. 



The cp6577 nucleotide sequence <SEQ ID 78> is: 

1 ATGAAAAAAT TATTATTTTC TACATTTCTT CTTGTTTTAG GATCAACAAG 

51 CGCAGCTCAT GCAAATTTAG GCTATGTTAA TTTAAAGCGA TGTCTTGAAG 

101 AATCCGATCT AGGTAAAAAG GAAACTGAAG AATTGGAAGC TATGAAACAG 

151 CAGTTTGTAA AAAATGCTGA GAAAATAGAA GAAGAACTCA CTTCTATTTA 

201 TAATAAGTTG CAAGATGAAG ATTACATGGA AAGCCTATCG GATTCTGCCT 

251 CTGAAGAGTT GCGAAAGAAA TTCGAAGATC TTTCAGGAGA GTACAATGCG 

301 TACCAGTCTC AGTACTATCA ATCTATCAAT CAAAGTAATG TAAAACGCAT 

351 TCAAAAACTC ATTCAAGAAG TAAAAATAGC TGCAGAATCA GTGCGGTCCA

401 AAGAAAAACT AGAAGCTATC CTTAATGAAG AAGCTGTCTT AGCAATAGCA 

451 CCTGGGACTG ATAAAACAAC CGAAATTATT GCTATTCTTA ACGAATCTTT 

501 CAAAAAACAA AACTAG 

The PSORT algorithm predicts a periplasmic space location (0.932). 




The protein was expressed in E.coli and purified as a his-tag product (Figure 39A) and as a GST-fusion
product (Figure 39B). The recombinant GST-fusion protein was used to immunise mice,
whose sera were used in a Western blot (Figure 39C) and for FACS analysis.

The cp6577 protein was also identified in the 2D-PAGE experiment. 

These experiments show that cp6577 is a useful immunogen. These properties are not evident from 
the sequence alone. 
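
The numbered layout used for the sequences in these Examples (a position index followed by blocks of ten residues, fifty per row) lends itself to a simple consistency check. The sketch below is illustrative only: it assumes the cp6577 listing (SEQ ID 78) with any stray spaces inside blocks removed, and the helper names `parse_listing` and `check_orf` are hypothetical rather than part of the original disclosure.

```python
# cp6577 coding sequence (SEQ ID 78) in the numbered 50-per-row layout.
CP6577_LISTING = """
1 ATGAAAAAAT TATTATTTTC TACATTTCTT CTTGTTTTAG GATCAACAAG
51 CGCAGCTCAT GCAAATTTAG GCTATGTTAA TTTAAAGCGA TGTCTTGAAG
101 AATCCGATCT AGGTAAAAAG GAAACTGAAG AATTGGAAGC TATGAAACAG
151 CAGTTTGTAA AAAATGCTGA GAAAATAGAA GAAGAACTCA CTTCTATTTA
201 TAATAAGTTG CAAGATGAAG ATTACATGGA AAGCCTATCG GATTCTGCCT
251 CTGAAGAGTT GCGAAAGAAA TTCGAAGATC TTTCAGGAGA GTACAATGCG
301 TACCAGTCTC AGTACTATCA ATCTATCAAT CAAAGTAATG TAAAACGCAT
351 TCAAAAACTC ATTCAAGAAG TAAAAATAGC TGCAGAATCA GTGCGGTCCA
401 AAGAAAAACT AGAAGCTATC CTTAATGAAG AAGCTGTCTT AGCAATAGCA
451 CCTGGGACTG ATAAAACAAC CGAAATTATT GCTATTCTTA ACGAATCTTT
501 CAAAAAACAA AACTAG
"""

STOP_CODONS = {"TAA", "TAG", "TGA"}


def parse_listing(text: str) -> str:
    """Join the residue blocks, checking each row's position index."""
    sequence = ""
    for line in text.strip().splitlines():
        position, *blocks = line.split()
        if int(position) != len(sequence) + 1:
            raise ValueError(f"row labelled {position} starts at {len(sequence) + 1}")
        sequence += "".join(blocks)
    return sequence


def check_orf(nt: str) -> dict:
    """Report basic open-reading-frame properties of a coding sequence."""
    return {
        "length": len(nt),
        "in_frame": len(nt) % 3 == 0,
        "starts_with_atg": nt.startswith("ATG"),
        "ends_with_stop": nt[-3:] in STOP_CODONS,
        "residues": len(nt) // 3 - 1,  # codons minus the stop codon
    }


report = check_orf(parse_listing(CP6577_LISTING))
print(report)
```

A row whose position index does not match the running length raises an error, which flags dropped or duplicated blocks before any downstream analysis is attempted.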

Example 40 

The following C.pneumoniae protein (pid 4376446) was expressed <SEQ ID 79; cp6446>: 

1 MKQPMSLIFS SVCLGLGLGS LSSCNQKPSW NYHNTSTSEE FFVHGNKSVS

51 QLPHYPSAFR TTQIFSEEHN DPYVVAKTDE ESRKIWREIH KNLKIKGSYI

101 PISTYGSLMH PKSAALTLKT YRPHPIWING YERSFNIDTG KYLKNGSRRR

151 TSHDGPKNRA VLNLIKSSGR RCNAIGLEMT EEDFVIARRR EGVYSLYPVE

201 VCSYPQGNPF VIAYAWIADE SACSKEVLPV KGYYSLVWES VSSSDSLNAF 

251 GDSFAEDYLR STFLANGTSI LCVHESYKKV PPQP* 

A predicted signal peptide is highlighted. 

The cp6446 nucleotide sequence <SEQ ID 80> is: 

1 ATGAAACAGC CCATGTCTCT TATCTTTTCA AGTGTATGTT TAGGATTAGG 
51 TCTTGGATCT CTTTCCTCCT GTAATCAAAA GCCCTCTTGG AATTATCACA 
101 ACACTTCAAC GAGCGAAGAA TTCTTTGTTC ATGGAAATAA GAGTGTTTCG 
151 CAACTGCCTC ATTATCCTTC TGCATTTCGT ACGACTCAAA TCTTTTCTGA 
201 AGAGCACAAT GATCCTTATG TCGTAGCTAA GACTGATGAA GAGTCTCGTA 
251 AAATTTGGAG AGAAATCCAT AAAAATCTCA AAATCAAAGG TTCTTACATT 
301 CCCATATCGA CTTATGGAAG TCTGATGCAC CCAAAATCAG CAGCTCTTAC 
351 ATTAAAAACG TATCGTCCAC ATCCTATTTG GATAAATGGA TACGAGCGTT 
401 CTTTTAATAT AGACACAGGA AAGTACTTAA AAAACGGAAG TCGCCGTAGA 
451 ACTTCTCACG ATGGTCCGAA AAATCGAGCT GTACTGAATC TCATTAAATC 
501 TTCGGGACGA CGCTGTAATG CTATAGGCCT TGAGATGACA GAAGAAGACT 
551 TTGTAATAGC TAGAAGGCGA GAAGGTGTTT ATAGCCTGTA TCCCGTTGAA 
601 GTGTGCTCGT ATCCTCAGGG GAATCCTTTT GTCATTGCTT ATGCCTGGAT 
651 TGCAGATGAG AGTGCTTGCT CAAAAGAGGT CCTACCTGTA AAAGGGTACT 
701 ATTCTTTAGT CTGGGAAAGC GTTTCTTCCT CTGATTCTCT GAATGCTTTT 
751 GGAGATTCCT TTGCAGAGGA CTACCTCAGA AGCACGTTTT TAGCAAACGG 
801 AACTTCTATA CTCTGTGTTC ATGAAAGCTA TAAGAAAGTT CCTCCTCAGC 
851 CCTAA 

The PSORT algorithm predicts an inner membrane location (0.177). 

The protein was expressed in E.coli and purified as a his-tag product and a GST-fusion product. The
GST-fusion product is shown in Figure 40A. The recombinant his-tag protein was used to immunise
mice, whose sera were used in a Western blot (Figure 40B) and for FACS analysis. 

These experiments show that cp6446 is a useful immunogen. These properties are not evident from 
the sequence alone. 

Example 41 

The following C.pneumoniae protein (pid 4377108) was expressed <SEQ ID 81; cp7108>: 

1 MSKKIKVLGH LTLCTLFRGV LCAAALSNIG YASTSQESPY QKSIEDWKGY
51 TFTDLELLSK EGWSEAHAVS GNGSRIVGAS GAGQGSVTAV IWESHLIKHL
101 GTLGGEASSA EGISKDGEVV VGWSDTREGY THAFVFDGRD MKDLGTLGAT
151 YSVARGVSGD GSIIVGVSAT ARGEDYGWQV GVKWEKGKIK QLKLLPQGLW
201 SEANAISEDG TVIVGRGEIS RNHIVAVKWN KNAVYSLGTL GGSVASAEAI
251 SANGKVIVGW STTNNGETHA FMHKDETMHD LGTLGGGFSV ATGVSADGRA
301 IVGFSAVKTG EIHAFYYAEG EMEDLTTLGG EEARVFDISS EGNDIIGSIK
351 TDAGAERAYL FHIHK*



A predicted signal peptide is highlighted. 



The cp7108 nucleotide sequence <SEQ ID 82> is:



1 ATGAGTAAGA AGATAAAGGT TCTAGGTCAT TTGACGCTCT GCACTCTGTT 

51 TAGAGGAGTG CTGTGTGCAG CGGCCCTTTC CAACATAGGA TATGCGAGTA 

101 CTTCTCAGGA ATCACCATAT CAGAAGTCTA TAGAAGACTG GAAAGGGTAT 

151 ACCTTTACAG ATCTTGAGTT ACTGAGTAAG GAAGGGTGGT CTGAAGCTCA

201 TGCAGTTTCT GGAAATGGCA GTAGAATTGT AGGAGCTTCG GGAGCTGGCC 

251 AAGGTAGTGT GACTGCTGTC ATATGGGAAA GTCACCTGAT AAAACATCTC 

301 GGCACTTTAG GTGGCGAGGC TTCATCTGCA GAGGGAATTT CAAAGGATGG 

351 AGAGGTGGTC GTTGGGTGGT CAGATACTAG AGAGGGATAT ACTCATGCCT 

401 TTGTCTTCGA CGGTAGAGAT ATGAAAGATC TCGGTACTCT AGGAGCTACC

451 TATTCTGTAG CAAGGGGTGT TTCTGGAGAT GGTAGTATCA TCGTAGGAGT 

501 CTCTGCAACT GCTCGTGGAG AGGATTACGG ATGGCAAGTT GGTGTCAAGT 

551 GGGAAAAAGG GAAAATCAAA CAATTGAAGT TGTTGCCTCA AGGTCTCTGG

601 TCTGAGGCGA ATGCAATCTC TGAGGATGGT ACGGTGATTG TCGGGAGAGG 

651 GGAAATCTCT CGCAATCACA TCGTTGCTGT AAAATGGAAT AAAAATGCTG 

701 TGTATAGTTT GGGGACTCTC GGAGGTAGTG TCGCTTCAGC AGAGGCTATA 

751 TCGGCAAATG GGAAAGTAAT TGTAGGATGG TCCACGACTA ATAATGGTGA 

801 GACTCATGCC TTTATGCACA AAGATGAGAC AATGCACGAT CTCGGCACTC 

851 TAGGAGGAGG TTTTTCTGTC GCAACTGGAG TTTCTGCTGA TGGGAGAGCC 

901 ATCGTAGGAT TTTCAGCAGT GAAGACCGGA GAAATTCATG CTTTTTACTA 

951 TGCAGAAGGA GAAATGGAGG ATTTAACAAC TTTGGGAGGG GAAGAAGCTC 

1001 GAGTGTTCGA CATATCTAGC GAAGGAAACG ATATCATTGG CTCTATAAAA 

1051 ACTGACGCTG GAGCTGAACG CGCCTATCTG TTCCATATAC ATAAATAA 



The PSORT algorithm predicts an outer membrane location (0.921). 

The protein was expressed in E.coli and purified as a GST-fusion product, as shown in Figure 41A.
The recombinant protein was used to immunise mice, whose sera were used in a Western blot 
(Figure 41B) and for FACS analysis (Figure 41C). A his-tagged protein was also expressed. 

The cp7108 protein was also identified in the 2D-PAGE experiment. 

These experiments show that cp7108 is a surface-exposed and immunoaccessible protein, and that it
is a useful immunogen. These properties are not evident from the sequence alone. 

Example 42 

The following C.pneumoniae protein (pid 4377287) was expressed <SEQ ID 83; cp7287>:



1 MVAKKTVRSY RSSFSHSVIV AILSAGIAFE AHSLHSSELD LGVFNKQFEE
51 HSAHVEEAQT SVLKGSDPVN PSQKESEKVL YTQVPLTQGS SGESLDLADA
101 NFLEHFQKLF EETTVFGIDQ KLVWSDLDTR NFSQPTQEPD TSNAVSEKIS
151 SDTKENRKDL ETEDPSKKSG LKEVSSDLPK SPETAVAAIS EDLEISENIS
201 ARDPLQGLAF FYKNTSSQSI SEKDSSFQGI IFSGSGANSG LGFENLKAPK
251 SGAAVYSDRD IVFENLVKGL SFISCESLED GSAAGVNIVV THCGDVTLTD
301 CATGLDLEAL KLVKDFSRGG AVFTARNHEV QNNLAGGILS VVGNKGAIVV
351 EKNSAEKSNG GAFACGSFVY SNNENTALWK ENQALSGGAI SSASDIDIQG
401 NCSAIEFSGN QSLIALGEHI GLTDFVGGGA LAAQGTLTLR NNAVVQCVKN
451 TSKTHGGAIL AGTVDLNETI SEVAFKQNTA ALTGGALSAN DKVIIANNFG
501 EILFEQNEVR NHGGAIYCGC RSNPKLEQKD SGENINIIGN SGAITFLKNK
551 ASVLEVMTQA EDYAGGGALW GHNVLLDSNS GNIQFIGNIG GSTFWIGEYV
601 GGGAILSTDR VTISNNSGDV VFKGNKGQCL AQKYVAPQET APVESDASST
651 NKDEKSLNAC SHGDHYPPKT VEEEVPPSLL EEHPVVSSTD IRGGGAILAQ
701 HIFITDNTGN LRFSGNLGGG EESSTVGDLA IVGGGALLST NEVNVCSNQN
751 VVFSDNVTSN GCDSGGAILA KKVDISANHS VEFVSNGSGK FGGAVCALNE
801 SVNITDNGSA VSFSKNRTRL GGAGVAAPQG SVTICGNQGN IAFKENFVFG
851 SENQRSGGGA IIANSSVNIQ DNAGDILFVS NSTGSYGGAI FVGSLVASEG
901 SNPRTLTITG NSGDILFAKN STQTAASLSE KDSFGGGAIY TQNLKIVKNA
951 GNVSFYGNRA PSGAGVQIAD GGTVCLEAFG GDILFEGNIN FDGSFNAIHL
1001 CGNDSKIVEL SAVQDKNIIF QDAITYEENT IRGLPDKDVS PLSAPSLIFN
1051 SKPQDDSAQH HEGTIRFSRG VSKIPQIAAI QEGTLALSQN AELWLAGLKQ
1101 ETGSSIVLSA GSILRIFDSQ VDSSAPLPTE NKEETLVSAG VQINMSSPTP
1151 NKDKAVDTPV LADIISITVD LSSFVPEQDG TLPLPPEIII PKGTKLHSNA
1201 IDLKIIDPTN VGYENHALLS SHKDIPLISL KTAEGMTGTP TADASLSNIK
1251 IDVSLPSITP ATYGHTGVWS ESKMEDGRLV VGWQPTGYKL NPEKQGALVL
1301 NNLWSHYTDL RALKQEIFAH HTIAQRMELD FSTNVWGSGL GVVEDCQNIG
1351 EFDGFKHHLT GYALGLDTQL VEDFLIGGCF SQFFGKTESQ SYKAKNDVKS
1401 YMGAAYAGIL AGPWLIKGAF VYGNINNDLT TDYGTLGIST GSWIGKGFIA
1451 GTSIDYRYIV NPRRFISAIV STVVPFVEAE YVRIDLPEIS EQGKEVRTFQ
1501 KTRFENVAIP FGFALEHAYS RGSRAEVNSV QLAYVFDVYR KGPVSLITLK
1551 DAAYSWKSYG VDIPCKAWKA RLSNNTEWNS YLSTYLAFNY EWREDLIAYD
1601 FNGGIRIIF*

A predicted signal peptide is highlighted. 

The cp7287 nucleotide sequence <SEQ ID 84> is: 

1 ATGGTAGCGA AAAAAACAGT ACGATCTTAT AGGTCTTCAT TTTCTCATTC 

51 CGTAATAGTA GCAATATTGT CAGCAGGCAT TGCTTTTGAA GCACATTCCT 

101 TACACAGCTC AGAACTAGAT TTAGGTGTAT TCAATAAACA GTTTGAGGAA 

151 CATTCTGCTC ATGTTGAAGA GGCTCAAACA TCTGTTTTAA AGGGATCAGA 

201 TCCTGTAAAT CCCTCTCAGA AAGAATCCGA GAAGGTTTTG TACACTCAAG
251 TGCCTCTTAC CCAAGGAAGC TCTGGAGAGA GTTTGGATCT CGCCGATGCT 

301 AATTTCTTAG AGCATTTTCA GCATCTTTTT GAAGAGACTA CAGTATTTGG
351 TATCGATCAA AAGCTGGTTT GGTCAGATTT AGATACTAGG AATTTTTCCC 
401 AACCCACTCA AGAACCTGAT ACAAGTAATG CTGTAAGTGA GAAAATCTCC 
451 TCAGATACCA AAGAGAATAG AAAAGACCTA GAGACTGAAG ATCCTTCAAA 
501 AAAAAGTGGC CTTAAAGAAG TTTCATCAGA TCTCCCTAAA AGTCCTGAAA 
551 CTGCAGTAGC AGCTATTTCT GAAGATCTTG AAATCTCAGA AAACATTTCA 
601 GCAAGAGATC CTCTTCAGGG TTTAGCATTT TTTTATAAAA ATACATCTTC 
651 TCAGTCTATC TCTGAAAAGG ATTCTTCATT TCAAGGAATT ATCTTTTCTG 
701 GTTCAGGAGC TAATTCAGGG CTAGGTTTTG AAAATCTTAA GGCGCCGAAA 
751 TCTGGGGCTG CAGTTTATTC TGATCGAGAT ATTGTTTTTG AAAATCTTGT 
801 TAAAGGATTG AGTTTTATAT CTTGTGAATC TTTAGAAGAT GGCTCTGCCG 
851 CAGGTGTAAA CATTGTTGTG ACCCATTCTG GTGATGTAAC TCTCACTGAT 
901 TGTGCCACTG GTTTAGACCT TGAAGCTTTA CGTCTGGTTA AAGATTTTTC 
951 TCGTGGAGGA GCTGTTTTCA CTGCTCGCAA CCATGAAGTG CAAAATAACC 

1001 TTGCAGGTGG AATTCTATCC GTTGTAGGCA ATAAAGGAGC TATTGTTGTA 

1051 GAGAAAAATA GTGCTGAGAA GTCCAATGGA GGAGCTTTTG CTTGCGGAAG 

1101 TTTTGTTTAC AGTAACAACG AAAACACCGC CTTGTGGAAA GAAAATCAAG 

1151 CATTATCAGG AGGAGCCATA TCCTCAGCAA GTGATATTGA TATTCAAGGG

1201 AACTGTAGCG CTATTGAATT TTCAGGAAAC CAGTCTCTAA TTGCTCTTGG 

1251 AGAGCATATA GGGCTTACAG ATTTTGTAGG TGGAGGAGCT TTAGCTGCTC 

1301 AAGGGACGCT TACCTTAAGA AATAATGCAG TAGTGCAATG TGTTAAAAAC

1351 ACTTCTAAAA CACATGGTGG AGCTATTTTA GCAGGTACTG TTGATCTCAA 

1401 CGAAACAATT AGCGAAGTTG CCTTTAAGCA GAATACAGCA GCTCTAACTG 

1451 GAGGTGCTTT AAGTGCAAAT GATAAGGTTA TAATTGCAAA TAACTTTGGA 

1501 GAAATTCTTT TTGAGCAAAA CGAAGTGAGG AATCACGGAG GAGCCATTTA 

1551 TTGTGGATGT CGATCTAATC CTAAGTTAGA ACAAAAGGAT TCTGGAGAGA 

1601 ACATCAATAT TATTGGAAAC TCCGGAGCTA TCACTTTTTT AAAAAATAAG 

1651 GCTTCTGTTT TAGAAGTGAT GACACAAGCT GAAGATTATG CTGGTGGAGG 

1701 CGCTTTATGG GGGCATAATG TTCTTCTAGA TTCCAATAGT GGGAATATTC 

1751 AATTTATAGG AAATATAGGT GGAAGTACCT TCTGGATAGG AGAATATGTC 

1801 GGTGGTGGTG CGATTCTCTC TACTGATAGA GTGACAATTT CTAATAACTC 

1851 TGGAGATGTT GTTTTTAAAG GAAACAAAGG CCAATGTCTT GCTCAAAAAT 

1901 ATGTAGCTCC TCAAGAAACA GCTCCCGTGG AATCAGATGC TTCATCTACA 

1951 AATAAAGACG AGAAGAGCCT TAATGCTTGT AGTCATGGAG ATCATTATCC 

2001 TCCTAAAACT GTAGAAGAGG AAGTGCCACC TTCATTGTTA GAAGAACATC 

2051 CTGTTGTTTC TTCGACAGAT ATTCGTGGTG GTGGGGCCAT TCTAGCTCAA 

2101 CATATCTTTA TTACAGATAA TACAGGAAAT CTGAGATTCT CTGGGAACCT 

2151 TGGTGGTGGT GAAGAGTCTT CTACTGTCGG TGATTTAGCT ATCGTAGGAG 

2201 GAGGTGCTTT GCTTTCTACT AATGAAGTTA ATGTTTGCAG TAACCAAAAT 

2251 GTTGTTTTTT CTGATAACGT GACTTCAAAT GGTTGTGATT CAGGGGGAGC 

2301 TATTTTAGCT AAAAAAGTAG ATATCTCCGC GAACCACTCG GTTGAATTTG 
2351 TCTCTAATGG TTCAGGGAAA TTCGGTGGTG CCGTTTGCGC TTTAAACGAA 
2401 TCAGTAAACA TTACGGACAA TGGCTCGGCA GTATCATTCT CTAAAAATAG 
2451 AACACGTCTT GGCGGTGCTG GAGTTGCAGC TCCTCAAGGC TCTGTAACGA 
2501 TTTGTGGAAA TCAGGGAAAC ATAGCATTTA AAGAGAACTT TGTTTTTGGC 
2551 TCTGAAAATC AAAGATCAGG TGGAGGAGCT ATCATTGCTA ACTCTTCTGT 
2601 AAATATTCAG GATAACGCAG GAGATATCCT ATTTGTAAGT AACTCTACGG 
2651 GATCTTATGG AGGTGCTATT TTTGTAGGAT CTTTGGTTGC TTCTGAAGGC 
2701 AGCAACCCAC GAACGCTTAC AATTACAGGC AACAGTGGGG ATATCCTATT 
2751 TGCTAAAAAT AGCACGCAAA CAGCCGCTTC TTTATCAGAA AAAGATTCCT 
2801 TTGGTGGAGG GGCCATCTAT ACACAAAACC TCAAAATTGT AAAGAATGCA 
2851 GGGAACGTTT CTTTCTATGG CAACAGAGCT CCTAGTGGTG CTGGTGTCCA 
2901 AATTGCAGAC GGAGGAACTG TTTGTTTAGA GGCTTTTGGA GGAGATATCT 
2951 TATTTGAAGG GAATATCAAT TTTGATGGGA GTTTCAATGC GATTCACTTA 
3001 TGCGGGAATG ACTCAAAAAT CGTAGAGCTT TCTGCTGTTC AAGATAAAAA 
3051 TATTATTTTC CAAGATGCAA TTACTTATGA AGAGAACACA ATTCGTGGCT 
3101 TGCCAGATAA AGATGTCAGT CCTTTAAGTG CCCCTTCATT AATTTTTAAC 
3151 TCCAAGCCAC AAGATGACAG CGCTCAACAT CATGAAGGGA CGATACGGTT 
3201 TTCTCGAGGG GTATCTAAAA TTCCTCAGAT TGCTGCTATA CAAGAGGGAA 
3251 CCTTAGCTTT ATCACAAAAC GCAGAGCTTT GGTTGGCAGG ACTTAAACAG 
3301 GAAACAGGAA GTTCTATCGT ATTGTCTGCG GGATCTATTC TCCGTATTTT 
3351 TGATTCCCAG GTTGATAGCA GTGCGCCTCT TCCTACAGAA AATAAAGAGG 
3401 AGACTCTTGT TTCTGCCGGA GTTCAAATTA ACATGAGCTC TCCTACACCC 
3451 AATAAAGATA AAGCTGTAGA TACTCCAGTA CTTGCAGATA TCATAAGTAT 
3501 TACTGTAGAT TTGTCTTCAT TTGTTCCTGA GCAAGACGGA ACTCTTCCTC 
3551 TTCCTCCTGA AATTATCATT CCTAAGGGAA CAAAATTACA TTCTAATGCC 
3601 ATAGATCTTA AGATTATAGA TCCTACCAAT GTGGGATATG AAAATCATGC 
3651 TCTTCTAAGT TCTCATAAAG ATATTCCATT AATTTCTCTT AAGACAGCGG 
3701 AAGGAATGAC AGGGACGCCT ACAGCAGATG CTTCTCTATC TAATATAAAA 
3751 ATAGATGTAT CTTTACCTTC GATCACACCA GCAACGTATG GTCACACAGG 
3801 AGTTTGGTCT GAAAGTAAAA TGGAAGATGG AAGACTTGTA GTCGGTTGGC 
3851 AACCTACGGG ATATAAGTTA AATCCTGAGA AGCAAGGGGC TCTAGTTTTG 
3901 AATAATCTCT GGAGTCATTA TACAGATCTT AGAGCTCTTA AGCAGGAGAT 
3951 CTTTGCTCAT CATACGATAG CTCAAAGAAT GGAGTTAGAT TTCTCGACAA 
4001 ATGTCTGGGG ATCAGGATTA GGTGTTGTTG AAGATTGTCA GAACATCGGA 
4051 GAGTTTGATG GGTTCAAACA TCATCTCACA GGGTATGCCC TAGGCTTGGA 
4101 TACACAACTA GTTGAAGACT TCTTAATTGG AGGATGTTTC TCACAGTTCT 
4151 TTGGTAAAAC TGAAAGCCAA TCCTACAAAG CTAAGAACGA TGTGAAGAGT 
4201 TATATGGGAG CTGCTTATGC GGGGATTTTA GCAGGTCCTT GGTTAATAAA 
4251 AGGAGCTTTT GTTTACGGTA ATATAAACAA CGATTTGACT ACAGATTACG 
4301 GTACTTTAGG TATTTCAACA GGTTCATGGA TAGGAAAAGG GTTTATCGCA 
4351 GGCACAAGCA TTGATTACCG CTATATTGTA AATCCTCGAC GGTTTATATC 
4401 GGCAATCGTA TCCACAGTGG TTCCTTTTGT AGAAGCCGAG TATGTCCGTA 
4451 TAGATCTTCC AGAAATTAGC GAACAGGGTA AAGAGGTTAG AACGTTCCAA 
4501 AAAACTCGTT TTGAGAATGT CGCCATTCCT TTTGGATTTG CTTTAGAACA 
4551 TGCTTATTCG CGTGGCTCAC GTGCTGAAGT GAACAGTGTA CAGCTTGCTT 
4601 ACGTCTTTGA TGTATATCGT AAGGGACCTG TCTCTTTGAT TACACTCAAG 
4651 GATGCTGCTT ATTCTTGGAA GAGTTATGGG GTAGATATTC CTTGTAAAGC 
4701 TTGGAAGGCT CGCTTGAGCA ATAATACGGA ATGGAATTCA TATTTAAGTA 
4751 CGTATTTAGC GTTTAATTAT GAATGGAGAG AAGATCTGAT AGCTTATGAC 
4801 TTCAATGGTG GTATCCGTAT TATTTTCTAG 



The PSORT algorithm predicts an inner membrane location (0.106). 

The protein was expressed in E.coli and purified as a GST-fusion product, as shown in Figure 42A. 
The recombinant protein was used to immunise mice, whose sera were used in a Western blot 
(Figure 42B) and for FACS analysis (Figure 42C). A his-tagged protein was also expressed. 

The cp7287 protein was also identified in the 2D-PAGE experiment and showed good 
cross-reactivity with human sera, including sera from patients with pneumonitis. 

These experiments show that cp7287 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 
Example 43 

The following C.pneumoniae protein (PID 4377305) was expressed <SEQ ID 85; cp7105>: 

1 MSLYQKWWNS QLKKSLCYST VAALIFMIPS QESFADSLID LNLGLDPSVE 

51 CLSGDGAFSV GYFTKAGSTP VEYQPFKYDV SKKTFTILSV ETANQSGYAY 

101 GISYDGTITV GTCSLGAGKY NGAKWSADGT LTPLTGITGG TSHTEARAIS 

151 KDTQVIEGFS YDASGQPKAV QWASGATTVT QLADISGGSR SSYAYAISDD 

201 GTIIVGSMES TITRKTTAVK WVNNVPTYLG TLGGDASTGL YISGDGTVIV 

251 GAANTATVTN GNQESHAYMY KDNQMKD* 

The cp7105 nucleotide sequence <SEQ ID 86> is: 

1 GTGAGTCTAT ATCAAAAATG GTGGAACAGT CAGTTAAAGA AGAGCCTCTG 

51 CTATTCGACT GTTGCTGCTC TAATATTTAT GATTCCTTCT CAAGAATCCT 

101 TTGCAGATAG TCTTATAGAT TTAAATTTAG GTTTAGATCC TTCGGTCGAA 

151 TGTCTGTCAG GAGATGGTGC ATTTTCTGTT GGGTATTTTA CTAAGGCGGG 

201 ATCGACTCCC GTAGAATATC AGCCGTTTAA ATACGACGTA TCTAAGAAGA 

251 CATTCACAAT CCTTTCCGTA GAAACGGCAA ATCAGAGCGG CTATGCTTAC 

301 GGAATCTCCT ACGATGGCAC GATCACTGTA GGAACGTGTA GCCTAGGTGC 

351 AGGAAAATAT AACGGCGCAA AATGGAGTGC GGATGGCACT TTAACACCCT 

401 TAACTGGAAT CACGGGGGGG ACGTCACATA CGGAAGCGCG TGCGATTTCT 

451 AAGGATACTC AGGTGATCGA GGGTTTCTCA TATGATGCTT CAGGGCAACC 

501 CAAGGCTGTG CAGTGGGCAA GCGGAGCGAC TACAGTAACA CAATTAGCAG 

551 ATATTTCAGG AGGCTCTAGA AGCTCTTATG CGTATGCTAT ATCTGATGAT 

601 GGCACGATTA TTGTTGGGTC TATGGAGAGC ACGATAACAA GGAAAACTAC 

651 AGCTGTAAAA TGGGTAAATA ATGTTCCTAC GTATCTGGGA ACCTTAGGAG 

701 GAGATGCTTC TACAGGTCTT TATATTTCTG GAGACGGCAC CGTGATTGTA 

751 GGTGCGGCAA ATACAGCAAC TGTAACCAAT GGGAATCAGG AATCCCACGC 

801 CTATATGTAT AAAGATAACC AAATGAAAGA TTGA 
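
The correspondence between <SEQ ID 86> and <SEQ ID 85> can be checked by translation. The following is a minimal Python sketch, not part of the original disclosure; it assumes the standard genetic code and that an initiating GTG or TTG codon is read as methionine, as is usual for bacterial genes (<SEQ ID 86> begins with GTG whereas <SEQ ID 85> begins with M). The names CODON_TABLE and translate are illustrative only.

```python
# Standard genetic code, built from the conventional T, C, A, G codon ordering.
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {a + b + c: AMINO[16 * i + 4 * j + k]
               for i, a in enumerate(BASES)
               for j, b in enumerate(BASES)
               for k, c in enumerate(BASES)}


def translate(nucleotides: str) -> str:
    """Translate an in-frame nucleotide string; spaces between groups are ignored."""
    seq = "".join(nucleotides.split()).upper()
    protein = []
    for pos in range(0, len(seq) - 2, 3):
        aa = CODON_TABLE[seq[pos:pos + 3]]
        # Assumption: ATG, GTG and TTG all initiate with methionine.
        if pos == 0 and seq[:3] in ("ATG", "GTG", "TTG"):
            aa = "M"
        protein.append(aa)
        if aa == "*":  # stop codon ends the open reading frame
            break
    return "".join(protein)


# First row of the cp7105 nucleotide listing.
first_row = "GTGAGTCTAT ATCAAAAATG GTGGAACAGT CAGTTAAAGA AGAGCCTCTG"
print(translate(first_row))  # MSLYQKWWNSQLKKSL
```

The printed residues agree with the start of the cp7105 protein listing; trailing bases that do not complete a codon are simply left untranslated.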

The PSORT algorithm predicts an inner membrane location (0.100). 

The protein was expressed in E.coli and purified as a GST-fusion product, as shown in Figure 43A. 
The recombinant protein was used to immunise mice, whose sera were used in a Western blot 
(Figure 43B) and for FACS analysis (Figure 43C). A his-tagged protein was also expressed. 

This protein also showed good cross-reactivity with human sera, including sera from patients with 
pneumonitis. 

These experiments show that cp7105 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 

Example 44 

The following C.pneumoniae protein (pid 4376802) was expressed <SEQ ID 87; cp6802>: 

1 MSNQLQPCIS LGCVSYINSF PLSLQLIKRN DIRCVLAPPA DLLNLLIEGK 

51 LDVALTSSLG AISHNLGYVP GFGIAANQRI LSVNLYAAPT FFNSPQPRIA 

101 ATLESRSSIG LLKVLCRHLW RIPTPHILRF ITTKVLRQTP ENYDGLLLIG 

151 DAALQHPVLP GFVTYDLASG WYDLTKLPFV FAKLLHSTSW KEHPLPNLAM 

201 EEALQQFESS PEEVLKEAHQ HTGLPPSLLQ EYYALCQYRL GEEHYESFEK 

251 FREYYGTLYQ QARL* 



A predicted signal peptide is highlighted. 

The cp6802 nucleotide sequence <SEQ ID 88> is: 



1 ATGTCTAACC AACTCCAGCC ATGTATAAGC TTAGGCTGCG TAAGTTATAT 

51 TAATTCCTTT CCGCTGTCCC TACAACTCAT AAAAAGAAAC GATATTCGCT 

101 GTGTTCTTGC TCCCCCTGCA GACCTCCTCA ACTTGCTAAT CGAAGGGAAA 

151 CTCGATGTTG CTTTGACCTC ATCCCTAGGA GCTATCTCTC ATAACTTGGG 

201 GTATGTCCCC GGCTTTGGAA TTGCAGCAAA CCAACGTATC CTCAGTGTAA 
251 ACCTCTATGC AGCTCCCACT TTCTTTAACT CACCGCAACC TCGGATTGCC 

301 GCAACTTTAG AAAGTCGCTC CTCTATAGGA CTCTTAAAAG TGCTTTGTCG 

351 TCATCTCTGG CGCATCCCAA CTCCTCATAT CCTAAGATTC ATAACTACAA 

401 AAGTACTCAG ACAAACCCCT GAAAATTATG ATGGCCTCCT CCTAATCGGA 

451 GATGCAGCGC TACAACATCC TGTACTTCCT GGATTTGTAA CCTATGACCT 

501 TGCCTCGGGG TGGTATGATC TTACAAAGCT ACCTTTTGTA TTTGCTCTTC 

551 TTCTACACAG CACCTCTTGG AAAGAACATC CCCTACCCAA CCTTGCGATG 

601 GAAGAAGCCC TCCAACAGTT CGAATCTTCA CCCGAAGAAG TCCTTAAAGA 

651 AGCTCATCAA CATACAGGTC TGCCCCCTTC TCTTCTTCAA GAATACTATG 

701 CCCTATGCCA GTACCGTCTA GGAGAAGAAC ACTACGAAAG CTTTGAAAAA 

751 TTCCGGGAAT ATTATGGAAC CCTCTACCAA CAAGCCCGAC TGTAA 



The PSORT algorithm predicts an inner membrane location (0.060). 

The protein was expressed in E.coli and purified as a GST-fusion product, as shown in Figure 44A. 
The recombinant protein was used to immunise mice, whose sera were used in a Western blot 
(Figure 44B) and for FACS analysis (Figure 44C). A his-tagged protein was also expressed. 

These experiments show that cp6802 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 

Example 45 

The following C.pneumoniae protein (pid 4376390) was expressed <SEQ ID 89; cp6390>: 



1 MVFSYYCMGL FFFSGAISSC GLLVSLGVGL GLSVLGVLLL LLAGLLLFKI 
51 QSMLREVPKA PDLLDLEDAS ERLRVKASRS LASLPKEISQ LESYIRSAAN 
101 DLNTIKTWPH KDQRLVETVS RKLERLAAAQ NYMISELCEI SEILEEEEHH 
151 LILAQESLEW IGKSLFSTFL DMESFLNLSH LSEVRPYLAV NDPRLLEITE 
201 ESWEVVSHFI NVTSAFKKAQ ILFKNNEHSR MKKKLESVQE LLETFIYKSL 
251 KRSYRELGCL SEKMRIIHDN PLFPWVQDQQ KYAHAKNEFG EIARCLEEFE 
301 KTFFWLDEEC AISYMDCWDF LNESIQNKKS RVDRDYISTK KIALKDRART 
351 YAKVLLEENP TTEGKIDLQD AQRAFERQSQ EFYTLEHTET KVRLEALQQC 
401 FSDLREATNV RQVRFTNSEN ANDLKESFEK IDKERVRYQK EQRLYWETID 
451 RNEQELREEI GESLRLQNRR KGYRAGYDAG RLKGLLRQWK KNLRDVEAKL 
501 EDATMDFEHE VSKSELCSVR ARLEVLEEEL MDMSPKVADI EELLSYEERC 
551 ILPIRENLER AYLQYNKCSE ILSKAKFFFP EDEQLLVSEA NLREVGAQLK 
601 QVQGKCQERA QKFAIFEKHI QEQKSLIKEQ VRSFDLAGVG FLKSELLSIA 
651 CNLYIKAVVK ESIPVDVPCM QLYYSYYEDN EAVVRNRLLN MTERYQNFKR 
701 SLNSIQFNGD VLLRDPVYQP EGHETRLKER ELQETTLSCK KLKVAQDRLS 
751 ELESRLSRR 

A predicted signal peptide is highlighted. 

The cp6390 nucleotide sequence <SEQ ID 90> is: 

1 TTGGTATTCT CATACTATTG CATGGGATTA TTTTTTTTCT CTGGAGCTAT 
51 TTCTAGTTGT GGTCTTTTAG TGTCTCTAGG AGTTGGTTTA GGACTTAGTG 
101 TTTTAGGAGT ACTTTTACTT CTCTTAGCAG GTCTTTTGCT TTTTAAGATC 
151 CAAAGTATGC TTCGAGAGGT GCCTAAGGCT CCTGATCTAT TAGATTTAGA 
201 AGATGCAAGT GAACGGCTTA GAGTAAAGGC TAGCCGTTCT TTAGCAAGCC 
251 TCCCGAAGGA AATCAGTCAG CTAGAGAGCT ACATTCGTTC TGCAGCTAAT 
301 GATCTAAATA CAATTAAGAC TTGGCCGCAT AAAGATCAAA GACTCGTCGA 
351 GACCGTGTCA CGAAAATTAG AGCGTCTGGC AGCTGCTCAA AACTATATGA 
401 TTTCTGAACT CTGCGAGATT AGTGAGATTC TTGAGGAAGA GGAGCATCAT 
451 CTAATTTTGG CTCAGGAATC TCTAGAATGG ATAGGTAAGA GTCTATTTTC 
501 TACCTTTCTG GACATGGAAT CTTTTTTAAA TTTGAGCCAT CTATCTGAAG 
551 TGCGTCCGTA CTTAGCTGTA AATGATCCTA GATTATTAGA AATTACCGAA 
601 GAATCTTGGG AAGTAGTGAG TCATTTCATA AATGTAACGT CTGCTTTTAA 
651 GAAAGCTCAG ATTCTTTTTA AGAACAACGA ACATTCTCGG ATGAAGAAGA 
701 AGTTAGAAAG TGTTCAAGAG TTACTGGAAA CATTTATTTA TAAGAGTTTA 
751 AAGAGAAGTT ATCGAGAATT AGGATGCTTA AGTGAAAAGA TGAGAATCAT 
801 TCACGACAAT CCTCTCTTCC CTTGGGTGCA AGATCAGCAG AAGTATGCTC 
851 ATGCTAAGAA TGAATTTGGA GAGATTGCGC GGTGTTTAGA GGAGTTTGAA 
901 AAGACGTTCT TCTGGTTGGA TGAGGAGTGT GCTATTTCTT ACATGGACTG 
951 TTGGGATTTT CTAAATGAGT CTATTCAGAA TAAGAAGTCC AGAGTAGATC 

1001 GAGATTATAT ATCCACGAAG AAAATTGCAT TAAAGGATAG AGCCCGCACT 

1051 TATGCTAAGG TTCTTTTAGA AGAGAATCCG ACTACAGAGG GTAAAATAGA 

1101 TTTGCAAGAC GCTCAAAGAG CCTTTGAGCG TCAAAGTCAG GAGTTTTATA 

1151 CACTAGAGCA TACGGAAACA AAGGTGAGAC TAGAAGCACT TCAACAGTGC 

1201 TTCTCGGATC TTAGGGAGGC GACGAACGTA AGGCAAGTTA GGTTTACAAA 
1251 TTCTGAAAAT GCGAATGATT TAAAGGAGAG TTTCGAGAAG ATAGATAAAG 

1301 AGCGTGTGCG ATATCAAAAA GAGCAAAGGC TCTATTGGGA AACAATAGAT 
1351 CGCAATGAGC AAGAGCTTAG GGAAGAGATT GGGGAGTCGC TTCGTTTACA 
1401 AAATCGGAGA AAAGGGTATA GGGCTGGATA TGATGCTGGG CGTTTAAAAG 
1451 GTTTGTTGCG TCAGTGGAAG AAAAATCTCC GCGATGTGGA AGCCCACCTT 
1501 GAAGATGCAA CTATGGATTT TGAGCATGAA GTAAGCAAGA GCGAATTGTG 
1551 CAGTGTTCGG GCGAGGCTCG AGGTTCTAGA AGAAGAGCTG ATGGATATGT 
1601 CTCCTAAAGT TGCGGATATA GAAGAGTTGT TGTCCTATGA AGAGCGTTGT 
1651 ATTCTTCCTA TTAGGGAAAA TTTAGAAAGG GCATACCTCC AATATAATAA 
1701 GTGTTCTGAA ATTTTATCCA AGGCAAAGTT CTTCTTTCCG GAAGACGAGC 
1751 AATTGCTAGT TTCGGAAGCG AATCTAAGAG AGGTGGGTGC CCAGTTAAAA 
1801 CAAGTACAGG GAAAATGTCA AGAGAGGGCC CAAAAGTTCG CAATATTTGA 
1851 AAAGCATATT CAGGAGCAGA AAAGCCTTAT TAAAGAGCAA GTGCGGAGTT 
1901 TTGATCTAGC GGGAGTTGGG TTTTTAAAGA GTGAGCTTCT TAGTATTGCT 
1951 TGTAACCTTT ATATAAAGGC GGTTGTTAAG GAGTCTATAC CAGTTGATGT 
2001 GCCTTGTATG CAGTTATATT ATAGTTATTA CGAAGATAAT GAAGCTGTAG 
2051 TGCGAAACCG CCTTTTAAAT ATGACGGAGA GGTATCAAAA TTTTAAAAGG 
2101 AGTTTGAATT CCATACAATT TAATGGTGAC GTTCTTTTAC GGGATCCGGT 
2151 CTATCAACCT GAAGGTCATG AGACCAGGCT AAAGGAACGG GAGCTACAAG 
2201 AAACAACTTT GTCTTGTAAG AAATTAAAAG TGGCTCAAGA TCGTCTTTCT 
2251 GAATTAGAGT CAAGGCTGTC TAGGAGATAG 

The PSORT algorithm predicts a periplasmic location (0.932). 

The protein was expressed in E.coli and purified as a GST-fusion product, as shown in Figure 45A. 
The recombinant protein was used to immunise mice, whose sera were used in a Western blot 
(Figure 45B) and for FACS analysis (Figure 45C). A his-tagged protein was also expressed. 

This protein also showed good cross-reactivity with human sera, including sera from patients with 
pneumonitis. 

These experiments show that cp6390 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 

Example 46 

The following C.pneumoniae protein (pid 4376272) was expressed <SEQ ID 91; cp6272>: 

1 MKRCFLFLAS FVLMGSSADA LTHQEAVKKK NSYLSHFKSV SGIVTIEDGV 

51 LNIHNNLRIQ ANKVYVENTV GQSLKLVAHG NVMVNYRAKT LVCDYLEYYE 

101 DTDSCLLTNG RFAMYPWFLG GSMITLTPET IVIRKGYIST SEGPKKDLCL 

151 SGDYLEYSSD SLLSIGKTTL RVCRIPILFL PPFSIMPMEI PKPPINFRGG 

201 TGGFLGSYLG MSYSPISRKH FSSTFFLDSF FKHGVGMGFN LHCSQKQVPE 

251 NVFNMKSYYA HRLAIDMAEA HDRYRLHGDF CFTHKHVNFS GEYHLSDSWE 

301 TVADIFPNNF MLKNTGPTRV DCTWNDNYFE GYLTSSVKVN SFQNANQELP 

351 YLTLRQYPIS IYNTGVYLEN IVECGYLNFA FSDHIVGENF SSLRLAARPK 

401 LHKTVPLPIG TLSSTLGSSL IYYSDVPEIS SRHSQLSAKL QLDYRFLLHK 

451 SYIQRRHIIE PFVTFITETR PLAKNEDHYI FSIQDAFHSL NLLKAGIDTS 

501 VLSKTNPRFP RIHAKLWTTH ILSNTESKPT FPKTACELSL PFGKKNTVSL 

551 DAEWIWKKHC WDHMNIRWEW IGNDNVAMTL ESLHRSKYSL IKCDRENFIL 

601 DVSRPIDQLL DSPLSDHRNL ILGKLFVRPH PCWNYRLSLR YGWHRQDTPN 

651 YLEYQMILGT KIFEHWQLYG VYERREADSR FFFFLKLDKP KKPPF* 



A predicted signal peptide is highlighted. 

The cp6272 nucleotide sequence <SEQ ID 92> is: 



1 ATGAAACGTT GCTTCTTATT TCTAGCTTCC TTTGTTCTTA TGGGTTCCTC 
51 AGCTGATGCT TTGACTCATC AAGAGGCTGT GAAAAAGAAA AACTCCTATC 
101 TTAGTCACTT TAAGAGTGTT TCTGGGATTG TGACCATCGA AGATGGGGTA 
151 TTGAATATCC ATAACAACCT GCGGATACAA GCCAATAAAG TGTATGTAGA 
201 AAATACTGTG GGTCAAAGCC TGAAGCTTGT CGCACATGGC AATGTTATGG 
251 TGAACTATAG GGCAAAAACC CTAGTTTGTG ATTACCTAGA GTATTACGAA 
301 GATACAGACT CTTGTCTTCT TACTAATGGA AGATTCGCGA TGTATCCTTG 
351 GTTTCTAGGG GGGTCTATGA TCACTCTAAC CCCAGAAACC ATAGTCATTC 
401 GGAAGGGATA TATCTCTACC TCCGAGGGTC CCAAAAAAGA CCTGTGCCTC 
451 TCCGGAGATT ACCTGGAATA TTCTTCAGAT AGTCTTCTTT CTATAGGGAA 
501 GACAACATTA AGGGTGTGTC GCATTCCGAT ACTTTTCTTA CCTCCATTTT 
551 CTATCATGCC TATGGAGATC CCTAAGCCTC CGATAAACTT TCGAGGAGGA 
601 ACAGGAGGAT TTCTGGGATC CTATTTGGGG ATGAGCTACT CGCCGATTTC 
651 TAGGAAGCAT TTCTCCTCGA CATTTTTCTT GGATAGCTTT TTCAAGCATG 
701 GCGTCGGCAT GGGATTCAAC CTCCATTGTT CTCAGAAGCA GGTTCCTGAG 
751 AATGTCTTCA ATATGAAAAG CTATTATGCC CACCGCCTTG CTATCGATAT 
801 GGCAGAAGCT CATGATCGCT ATCGCCTACA CGGAGATTTC TGCTTCACGC 
851 ATAAGCATGT AAATTTTTCT GGAGAATACC ATCTCAGCGA TAGTTGGGAA 
901 ACTGTTGCTG ACATTTTCCC CAACAACTTC ATGTTGAAAA ATACAGGCCC 
951 CACACGTGTC GATTGCACTT GGAATGACAA CTATTTTGAA GGGTATCTCA 
1001 CCTCTTCTGT TAAGGTAAAC TCTTTCCAAA ATGCCAACCA AGAGCTCCCT 
1051 TATTTAACAT TAAGGCAGTA CCCGATTTCT ATTTATAATA CGGGAGTGTA 
1101 CCTTGAAAAC ATCGTAGAAT GTGGGTATTT AAACTTTGCT TTTAGCGATC 
1151 ATATCGTTGG CGAGAATTTC TCTTCACTAC GTCTTGCTGC GCGCCCTAAG 
1201 CTCCATAAAA CTGTGCCTCT ACCTATAGGA ACGCTCTCCT CCACCCTAGG 
1251 GAGTTCTCTG ATTTACTATA GCGATGTTCC TGAGATCTCC TCGCGCCATA 
1301 GTCAGCTTTC CGCGAAGCTA CAACTTGATT ATCGCTTTCT ATTACATAAG 
1351 TCCTACATTC AAAGACGCCA TATTATAGAG CCGTTCGTTA CCTTCATTAC 
1401 AGAGACTCGT CCTCTAGCTA AGAATGAAGA TCATTATATC TTTTCTATTC 
1451 AAGATGCCTT TCACTCCTTA AACCTTCTGA AAGCGGGTAT AGATACCTCG 
1501 GTACTGAGTA AGACTAACCC TCGATTCCCG AGAATCCATG CGAAGCTGTG 
1551 GACTACCCAC ATCTTGAGCA ATACAGAAAG CAAACCCACG TTTCCCAAAA 
1601 CTGCATGCGA GCTATCTCTA CCTTTTGGAA AGAAAAATAC AGTCTCCTTA 
1651 GATGCTGAAT GGATTTGGAA AAAGCACTGT TGGGATCACA TGAACATACG 
1701 TTGGGAGTGG ATCGGAAATG ACAATGTGGC TATGACTCTA GAATCCCTGC 
1751 ATAGAAGCAA ATACAGCCTG ATTAAGTGTG ACAGGGAGAA CTTCATTTTA 
1801 GATGTCAGCC GTCCCATTGA CCAGCTTTTA GACTCCCCTC TCTCTGATCA 
1851 TAGGAATCTC ATTTTAGGGA AATTATTTGT ACGACCTCAT CCCTGTTGGA 
1901 ATTACCGCTT ATCCTTACGC TATGGCTGGC ATCGCCAGGA CACTCCGAAC 
1951 TACCTAGAAT ACCAGATGAT TCTAGGGACG AAGATCTTCG AACATTGGCA 
2001 GCTCTATGGG GTGTATGAAC GCCGAGAAGC AGATAGTCGA TTTTTCTTCT 
2051 TCTTAAAGCT CGACAAACCT AAAAAACCTC CCTTCTAA 

The PSORT algorithm predicts an outer membrane location (0.48). 

The protein was expressed in E.coli and purified as a GST-fusion product, as shown in Figure 46A. 
The recombinant protein was used to immunise mice, whose sera were used in a Western blot and for 
FACS analysis (Figure 46B). A his-tagged protein was also expressed. 

This protein also showed good cross-reactivity with human sera, including sera from patients with 
pneumonitis. 

These experiments show that cp6272 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 



Example 47 

The following C.pneumoniae protein (pid 4377111) was expressed <SEQ ID 93; cp7111>: 

1 MFEAVTADIQ AREILDSRGY PTLHVKVTTS TGSVGEARVP SGASTGKKEA 

51 LEFRDTDSPR YQGKGVLQAV KNVKEILFPL VKGCSVYEQS LIDSLMMDSD 

101 GSPNKETLGA NAILGVSLAT AHAAAATLRR PLYRYLGGCF ACSLPCPMMN 

151 LINGGMHADN GLEFQEFMIR PIGASSIKEA VNMGADVFHT LKKLLHERGL 

201 STGVGDEGGF APNLASNEEA LELLLLAIEK AGFTPGKDIS LALDCAASSF 

251 YNVKTGTYDG RHYEEQIAIL SNLCDRYPID SIEDGLAEED YDGWALLTEV 

301 LGEKVQIVGD DLFVTNPELI LEGISNGLAN SVLIKPNQIG TLTETVYAIK 

351 LAQMAGYTTI ISHRSGETTD TTIADLAVAF NAGQIKTGSL SRSERVAKYN 

401 RLMEIEEELG SEAIFTDSNV FSYEDSEE* 

A predicted signal peptide is highlighted. 



The cp7111 nucleotide sequence <SEQ ID 94> is: 



1 ATGTTTGAAG CTGTCATTGC CGATATCCAG GCTAGGGAAA TCTTGGATTC 

51 TCGCGGGTAT CCCACTTTAC ATGTTAAAGT AACCACTAGC ACAGGTTCTG 

101 TTGGAGAAGC TCGGGTTCCT TCAGGAGCAT CCACAGGGAA AAAAGAAGCC 

151 TTAGAGTTTC GTGATACAGA TTCTCCTCGT TATCAAGGCA AAGGGGTTTT 

201 GCAAGCTGTA AAAAACGTAA AAGAAATTCT TTTTCCCCTC GTCAAGGGAT 

251 GTAGTGTTTA TGAGCAATCC TTAATTGATT CTCTGATGAT GGATTCTGAC 

301 GGCTCTCCGA ACAAAGAAAC TCTAGGGGCC AATGCTATTT TAGGAGTCTC 

351 TCTAGCTACA GCACATGCAG CAGCAGCAAC ACTACGCAGA CCTCTGTATC 

401 GTTATTTAGG AGGGTGTTTT GCCTGCAGTC TTCCCTGTCC TATGATGAAT 

451 CTGATCAATG GAGGCATGCA TGCCGATAAC GGCTTGGAGT TCCAAGAATT 

501 TATGATCCGT CCTATTGGAG CCTCTTCCAT CAAAGAAGCT GTCAACATGG 

551 GTGCTGACGT TTTTCATACT TTGAAAAAAT TACTCCATGA AAGAGGCTTA 

601 TCTACTGGAG TGGGTGACGA AGGAGGCTTC GCCCCGAATC TTGCTTCTAA 

651 TGAAGAAGCT CTAGAGCTCC TATTGCTGGC TATTGAAAAA GCAGGCTTTA 

701 CTCCAGGAAA AGATATATCG CTAGCCTTAG ACTGCGCAGC ATCCTCATTC 

751 TATAACGTAA AAACAGGCAC GTATGATGGG AGGCACTATG AAGAGCAAAT 

801 CGCAATCCTT TCTAATTTAT GTGATCGCTA TCCTATAGAC TCCATAGAAG 

851 ATGGTCTTGC TGAAGAAGAC TATGACGGGT GGGCCTTGTT AACTGAAGTT 

901 CTTGGAGAAA AAGTACAGAT TGTGGGTGAT GACCTATTTG TTACAAATCC 

951 GGAATTAATA TTAGAGGGTA TTAGCAATGG ATTAGCGAAC TCTGTGTTGA 

1001 TTAAACCAAA TCAGATAGGG ACGCTTACTG AAACAGTGTA TGCTATCAAG 

1051 CTTGCGCAAA TGGCTGGCTA TACTACAATT ATTTCTCATC GCTCAGGAGA 

1101 AACTACGGAC ACTACGATTG CAGATCTTGC TGTTGCCTTC AACGCCGGTC 

1151 AAATCAAAAC AGGCTCTTTA TCACGTTCTG AGCGTGTTGC AAAATACAAT 

1201 AGACTCATGG AAATTGAAGA AGAGCTTGGA TCCGAAGCAA TTTTCACAGA 

1251 TTCTAATGTA TTTTCTTACG AGGATTCTGA GGAATAG 

The PSORT algorithm predicts an inner membrane location (0.100). 

The protein was expressed in E.coli and purified as a GST-fusion product, as shown in Figure 47A. 
The recombinant protein was used to immunise mice, whose sera were used in a Western blot 
(Figure 47B) and for FACS analysis (Figure 47C). A his-tagged protein was also expressed. 

The cp7111 protein was also identified in the 2D-PAGE experiment and showed good 
cross-reactivity with human sera, including sera from patients with pneumonitis. 

These experiments show that cp7111 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 



Example 48 

The following C.pneumoniae protein (pid 4455886) was expressed <SEQ ID 95; cp0010>: 

1 MKSQFSWLVL SSTLACFTSC STVFAATAEN IGPSDSFDGS TNTGTYTPKN 
51 TTTGIDYTLT GDITLQNLGD SAALTKGCFS DTTESLSFAG KGYSLSFLNI 
101 KSSAEGAALS VTTDKNLSLT GFSSLTFLAA PSSVITTPSG KGAVKCGGDL 
151 TFDNNGTILF KQDYCEENGG AISTKNLSLK NSTGSISFEG NKSSATGKKG 
201 GAICATGTVD ITNNTAPTLF SNNIAEAAGG AINSTGNCTI TGNTSLVFSE 
251 NSVTATAGNG GALSGDADVT ISGNQSVTFS GNQAVANGGA IYAKKLTLAS 
301 GGGGVSPFLT IIVQGTTAGN GGAISILAAG ECSLSAEAGD ITFNGNAIVA 
351 TTPQTTKRNS IDIGSTAKIT NLRAISGHSI FFYDPITANT AADSTDTLNL 
401 NKADAGNSTD YSGSIVFSGE KLSEDEAKVA DNLTSTLKQP VTLTAGNLVL 
451 KRGVTLDTKG FTQTAGSSVI MDAGTTLKAS TEEVTLTGLS IPVDSLGEGK 
501 KVVIAASAAS KNVALSGPIL LLDNQGNAYE NHDLGKTQDF SFVQLSALGT 
551 ATTTDVPAVP TVATPTHYGY QGTWGMTWVD DTASTPKTKT ATLAWTNTGY 
601 LPNPERQGPL VPNSLWGSFS DIQAIQGVIE RSALTLCSDR GFWAAGVANF 
651 LDKDKKGEKR KYRHKSGGYA IGGAAQTCSE NLISFAFCQL FGSDKDFLVA 
701 KNHTDTYAGA FYIQHITECS GFIGCLLDKL PGSWSHKPLV LEGQLAYSHV 
751 SNDLKTKYTA YPEVKGSWGN NAFNMMLGAS SHSYPEYLHC FDTYAPYIKL 
801 NLTYIRQDSF SEKGTEGRSF DDSNLFNLSL PIGVKFEKFS DCNDFSYDLT 
851 LSYVPDLIRN DPKCTTALVI SGASWETYAN NLARQALQVR AGSHYAFSPM 
901 FEVLGQFVFE VRGSSRIYNV DLGGKFQF* 

A predicted signal peptide is highlighted. 

The cp0010 nucleotide sequence <SEQ ID 96> is: 

1 ATGAAATCGC AATTTTCCTG GTTAGTGCTC TCTTCGACAT TGGCATGTTT 
51 TACTAGTTGT TCCACTGTTT TTGCTGCAAC TGCTGAAAAT ATAGGCCCCT 
101 CTGATAGCTT TGACGGAAGT ACTAACACAG GCACCTATAC TCCTAAAAAT 
151 ACGACTACTG GAATAGACTA TACTCTGACA GGAGATATAA CTCTGCAAAA 
201 CCTTGGGGAT TCGGCAGCTT TAACGAAGGG TTGTTTTTCT GACACTACGG 
251 AATCTTTAAG CTTTGCCGGT AAGGGGTACT CACTTTCTTT TTTAAATATT 
301 AAGTCTAGTG CTGAAGGCGC AGCACTTTCT GTTACAACTG ATAAAAATCT 
351 GTCGCTAACA GGATTTTCGA GTCTTACTTT CTTAGCGGCC CCATCATCGG 
401 TAATCACAAC CCCCTCAGGA AAAGGTGCAG TTAAATGTGG AGGGGATCTT 
451 ACATTTGATA ACAATGGAAC TATTTTATTT AAACAAGATT ACTGTGAGGA 
501 AAATGGCGGA GCCATTTCTA CCAAGAATCT TTCTTTGAAA AACAGCACGG 
551 GATCGATTTC TTTTGAAGGG AATAAATCGA GCGCAACAGG GAAAAAAGGT 
601 GGGGCTATTT GTGCTACTGG TACTGTAGAT ATTACAAATA ATACGGCTCC 
651 TACCCTCTTC TCGAACAATA TTGCTGAAGC TGCAGGTGGA GCTATAAATA 
701 GCACAGGAAA CTGTACAATT ACAGGGAATA CGTCTCTTGT ATTTTCTGAA 
751 AATAGTGTGA CAGCGACCGC AGGAAATGGA GGAGCTCTTT CTGGAGATGC 
801 CGATGTTACC ATATCTGGGA ATCAGAGTGT AACTTTCTCA GGAAACCAAG 
851 CTGTAGCTAA TGGCGGAGCC ATTTATGCTA AGAAGCTTAC ACTGGCTTCC 
901 GGGGGGGGGG GGGTATCTCC TTTTCTAACA ATAATAGTCC AAGGTACCAC 
951 TGCAGGTAAT GGTGGAGCCA TTTCTATACT GGCAGCTGGA GAGTGTAGTC 
1001 TTTCAGCAGA AGCAGGGGAC ATTACCTTCA ATGGGAATGC CATTGTTGCA 
1051 ACTACACCAC AAACTACAAA AAGAAATTCT ATTGACATAG GATCTACTGC 
1101 AAAGATCACG AATTTACGTG CAATATCTGG GCATAGCATC TTTTTCTACG 
1151 ATCCGATTAC TGCTAATACG GCTGCGGATT CTACAGATAC TTTAAATCTC 
1201 AATAAGGCTG ATGCAGGTAA TAGTACAGAT TATAGTGGGT CGATTGTTTT 
1251 TTCTGGTGAA AAGCTCTCTG AAGATGAAGC AAAAGTTGCA GACAACCTCA 
1301 CTTCTACGCT GAAGCAGCCT GTAACTCTAA CTGCAGGAAA TTTAGTACTT 
1351 AAACGTGGTG TCACTCTCGA TACGAAAGGC TTTACTCAGA CCGCGGGTTC 
1401 CTCTGTTATT ATGGATGCGG GCACAACGTT AAAAGCAAGT ACAGAGGAGG 
1451 TCACTTTAAC AGGTCTTTCC ATTCCTGTAG ACTCTTTAGG CGAGGGTAAG 
1501 AAAGTTGTAA TTGCTGCTTC TGCAGCAAGT AAAAATGTAG CCCTTAGTGG 
1551 TCCGATTCTT CTTTTGGATA ACCAAGGGAA TGCTTATGAA AATCACGACT 
1601 TAGGAAAAAC TCAAGACTTT TCATTTCTGC AGCTCTCTGC TCTGGGTACT 
1651 GCAACAACTA CAGATGTTCC AGCGGTTCCT ACAGTAGCAA CTCCTACGCA 
1701 CTATGGGTAT CAAGGTACTT GGGGAATGAC TTGGGTTGAT GATACCGCAA 
1751 GCACTCCAAA GACTAAGACA GCGACATTAG CTTGGACCAA TACAGGCTAC 
1801 CTTCCGAATC CTGAGCGTCA AGGACCTTTA GTTCCTAATA GCCTTTGGGG 
1851 ATCTTTTTCA GACATCCAAG CGATTCAAGG TGTCATAGAG AGAAGTGCTT 
1901 TGACTCTTTG TTCAGATCGA GGCTTCTGGG CTGCGGGAGT CGCCAATTTC 
1951 TTAGATAAAG ATAAGAAAGG GGAAAAACGC AAATACCGTC ATAAATCTGG 
2001 TGGATATGCT ATCGGAGGTG CAGCGCAAAC TTGTTCTGAA AACTTAATTA 
2051 GCTTTGCCTT TTGCCAACTC TTTGGTAGCG ATAAAGATTT CTTAGTCGCT 
2101 AAAAATCATA CTGATACCTA TGCAGGAGCC TTCTATATCC AACACATTAC 
2151 AGAATGTAGT GGGTTCATAG GTTGTCTCTT AGATAAACTT CCTGGCTCTT 
2201 GGAGTCATAA ACCCCTCGTT TTAGAAGGGC AGCTCGCTTA TAGCCACGTC 
2251 AGTAATGATC TGAAGACAAA GTATACTGCG TATCCTGAGG TGAAAGGTTC 
2301 TTGGGGGAAT AATGCTTTTA ACATGATGTT GGGAGCTTCT TCTCATTCTT 
2351 ATCCTGAATA CCTGCATTGT TTTGATACCT ATGCTCCATA CATCAAACTG 
2401 AATCTGACCT ATATACGTCA GGACAGCTTC TCGGAGAAAG GTACAGAAGG 
2451 AAGATCTTTT GATGACAGCA ACCTCTTCAA TTTATCTTTG CCTATAGGGG 
2501 TGAAGTTTGA GAAGTTCTCT GATTGTAATG ACTTTTCTTA TGATCTGACT 
2551 TTATCCTATG TTCCTGATCT TATCCGCAAT GATCCCAAAT GCACTACAGC 
2601 ACTTGTAATC AGCGGAGCCT CTTGGGAAAC TTATGCCAAT AACTTAGCAC 
2651 GACAGGCCTT GCAAGTGCGT GCAGGCAGTC ACTACGCCTT CTCTCCTATG 
2701 TTTGAAGTGC TCGGCCAGTT TGTCTTTGAA GTTCGTGGAT CCTCACGGAT 
2751 TTATAATGTA GATCTTGGGG GTAAGTTCCA ATTCTAG 

The PSORT algorithm predicts an outer membrane location (0.922). 

The protein was expressed in E.coli and purified as a GST-fusion product, as shown in Figure 48A. 
The recombinant protein was used to immunise mice, whose sera were used in a Western blot 
(Figure 48B) and for FACS analysis (Figure 48C). A his-tagged protein was also expressed. 

The cp0010 protein was also identified in the 2D-PAGE experiment and showed good 
cross-reactivity with human sera, including sera from patients with pneumonitis. 

These experiments show that cp0010 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 

Example 49 

The following C.pneumoniae protein (pid 4376296) was expressed <SEQ ID 97; cp6296>: 

1 MEEVSEYLQQ VENQLESCSK RLTKMETFAL GVRLEAKEEI ESIILSDVVN 

51 RFEVLCRDIE DMLSRVEEIE RMLRMAELPL LPIKEALTKA FVQHNSCKEK 

101 LTKVEPYFKE SPAYLTSEER LQSLNQTLQR AYKESQKVSG LESEVRACRE 

151 QLKDQVRQFE TQGVSLIKEE ILFVTSTFRT KFSYHSFRLH VPCMRLYEEY 

201 YDDIDLERTR ARWMAMSERY RDAFQAFQEM LKEGLVEEAQ ALRETEYWLY 

251 REERKSKKKH* 

The cp6296 nucleotide sequence <SEQ ID 98> is: 

1 ATGGAGGAGG TGTCTGAGTA TCTTCAGCAA GTAGAAAATC AGTTGGAATC 

51 CTGTTCCAAG CGATTAACCA AGATGGAAAC TTTTGCCTTA GGTGTGAGGT 

101 TGGAAGCTAA AGAAGAGATA GAGTCTATCA TACTTTCTGA TGTAGTGAAC 

151 CGTTTTGAGG TTTTATGTAG AGATATTGAA GATATGCTAT CTCGAGTCGA 

201 GGAGATAGAG CGGATGTTAC GTATGGCGGA GCTTCCTCTA CTTCCTATAA 

251 AAGAAGCGCT TACCAAGGCT TTTGTACAAC ATAACAGCTG TAAAGAGAAG 

301 TTAACCAAGG TAGAGCCTTA CTTTAAAGAG AGCCCTGCAT ATCTAACTAG 

351 TGAAGAGCGA TTGCAGAGTT TGAATCAGAC TTTACAACGT GCGTACAAAG 

401 AGTCCCAAAA GGTTTCAGGT TTAGAATCGG AAGTGAGAGC CTGTCGAGAG 

451 CAGCTTAAAG ATCAAGTAAG ACAGTTTGAA ACTCAAGGAG TGAGCTTGAT 

501 AAAAGAAGAG ATTCTCTTTG TGACTAGTAC CTTTAGAACT AAATTTAGCT 

551 ATCATTCATT TCGATTACAT GTTCCTTGCA TGAGGTTGTA TGAGGAGTAT 

601 TATGATGACA TTGATCTAGA GAGAACTCGA GCTCGATGGA TGGCGATGTC 

651 TGAGAGGTAT AGAGATGCTT TTCAGGCATT CCAGGAGATG TTGAAGGAAG 

701 GCCTAGTTGA AGAAGCTCAG GCTCTTAGAG AAACCGAGTA CTGGTTATAT 

751 CGAGAGGAGA GAAAGAGTAA AAAGAAACAT TGA 
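
The listings in this document give each row as a 1-based start position followed by groups of ten residues. The following is a minimal Python sketch, not part of the original disclosure, for joining such rows into a single string while checking that each position label agrees with the number of residues read so far. The function name parse_listing is illustrative only, and the sketch assumes one row per line with the label as the first whitespace-delimited token.

```python
import re


def parse_listing(text: str) -> str:
    """Join numbered listing rows into one sequence, validating position labels."""
    sequence = ""
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # blank lines between rows carry no data
        label, _, body = line.partition(" ")
        residues = re.sub(r"\s+", "", body)  # drop spaces between (or inside) groups
        expected = len(sequence) + 1
        if int(label) != expected:
            raise ValueError(f"row labelled {label} should start at {expected}")
        sequence += residues
    return sequence


# First two rows of the cp6296 nucleotide listing.
rows = """
1 ATGGAGGAGG TGTCTGAGTA TCTTCAGCAA GTAGAAAATC AGTTGGAATC
51 CTGTTCCAAG CGATTAACCA AGATGGAAAC TTTTGCCTTA GGTGTGAGGT
"""
sequence = parse_listing(rows)
print(len(sequence))  # 100
```

A row whose label does not match the running length, for example a label split as "2 01" or a row with a missing group, raises ValueError rather than being silently joined.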

The PSORT algorithm predicts a cytoplasmic location (0.523). 

The protein was expressed in E.coli and purified as a GST-fusion product, as shown in Figure 49A. 
The recombinant protein was used to immunise mice, whose sera were used in a Western blot 
(Figure 49B) and for FACS analysis (Figure 49C). A his-tagged protein was also expressed. 

These experiments show that cp6296 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 

Example 50 

The following C.pneumoniae protein (pid 4376664) was expressed <SEQ ID 99; cp6664>: 

1 MVLFHAQASG RNRVKADAIV LPFWHFKDAK NAASFEAEFE PSYLPALENF 
51 QGKTGEIELL YSSPKAKEKR IVLLGLGKNE ELTSDVVFQT YATLTRVLRK 
101 AKCSTVNIIL PTISELRLSA EEFLVGLSSG ILSLNYDYPR YNKVDRNLET 
151 PLSKVTVIGI VPKMADAIFR KEAAIFEGVY LTRDLVNRNA DEITPKKLAE 
201 VALNLGKEFP SIDTKVLGKD AIAKEKMGLL LAVSKGSCVD PHFIVVRYQG 
251 RPKSKDHTVL IGKGVTFDSG GLDLKPGKSM LTMKEDMAGG ATVLGILSAL 
301 AVLELPINVT GIIPATENAI DGASYKMGDV YVGMSGLSVE ICSTDAEGRL 
351 ILADAITYAL KYCKPTRIID FATLTGAMVV SLGEEVAGFF SNNDVLAEDL 
401 LEASAETSEP LWRLPLVKKY DKTLHSDIAD MKNLGSNRAG AITAALFLQR 
451 FLEESSVAWA HLDIAGTAYH EKEEDRYPKY ASGFGVRSIL YYLENSLSK* 

The cp6664 nucleotide sequence <SEQ ID 100> is: 

1 [illegible] TTCATGCTCA AGCCTCTGGG CGTAATCGTG TTAAGGCAGA 
51 [illegible] CTGCCCTTTT GGCATTTTAA GGATGCAAAA AATGCAGCTT 
101 [illegible] CGAGTTTGAA CCCTCGTATC TCCCCGCTTT AGAAAACTTT 
151 [illegible] CCGGGGAGAT TGAACTCCTT TATAGTAGTC CTAAAGCTAA 
201 [illegible] ATTGTCCTCT TAGGCTTAGG GAAAAATGAA GAGCTCACCT 
251 [illegible] TTTCCAAACC TATGCGACAC TAACTCGTGT CTTACGTAAA 
301 [illegible] CCACAGTCAA TATCATCTTA CCTACAATTT CTGAATTGCG 
351 [illegible] GAAGAATTCT TAGTGGGGTT GTCCTCAGGA ATTTTGTCAT 
401 [illegible] CTACCCACGT TATAATAAGG TAGATCGTAA TCTTGAAACT 
451 CCTCTTTCTA AAGTCACGGT TATCGGTATC GTTCCCAAAA TGGCGGATGC 
501 TATCTTTAGG AAAGAAGCAG CCATTTTCGA AGGCGTATAT CTCACTCGAG 
551 ATCTTGTGAA CAGGAATGCT GATGAAATTA CCCCTAAGAA ATTGGCAGAG 
601 GTTGCTCTGA ATCTGGGAAA AGAGTTCCCT AGTATTGATA CTAAGGTCTT 
651 GGGAAAAGAT GCCATCGCCA AAGAGAAAAT GGGACTCCTA TTGGCTGTTT 
701 CCAAGGGTTC TTGTGTGGAT CCACACTTTA TCGTTGTCCG TTATCAAGGA 
751 CGTCCTAAGT CTAAAGATCA CACCGTCTTG ATAGGGAAAG GGGTCACTTT 
801 TGACTCTGGA GGTTTAGACC TCAAGCCTGG AAAATCCATG CTTACTATGA 
851 AAGAAGACAT GGCAGGTGGG GCTACAGTCC TCGGGATTCT CTCGGCGTTA 
901 GCAGTTTTAG AGCTTCCTAT AAATGTCACG GGGATCATTC CTGCTACAGA 
951 GAATGCTATC GATGGCGCCT CCTATAAAAT GGGAGATGTC TATGTAGGAA 
1001 TGTCGGGGCT TTCTGTTGAG ATTTGTAGTA CCGATGCTGA GGGACGTCTT 
1051 ATCCTCGCTG ATGCGATTAC ATATGCTTTA AAATATTGTA AACCGACACG 
1101 TATTATAGAT TTTGCAACTC TAACAGGAGC TATGGTAGTC TCTCTAGGAG 
1151 AAGAGGTTGC AGGTTTCTTT TCCAATAACG ATGTTTTAGC TGAAGATCTT 
1201 TTAGAGGCGT CAGCCGAAAC CTCCGAGCCG TTATGGAGAC TTCCTCTAGT 
1251 TAAGAAGTAT GATAAAACAT TGCATTCTGA TATTGCTGAT ATGAAAAATC 
1301 TAGGCAGTAA CCGTGCAGGG GCTATTACAG CAGCATTATT CTTGCAGAGA 
1351 TTTTTGGAAG AATCTTCGGT AGCTTGGGCA CATCTTGATA TTGCAGGTAC 
1401 TGCATATCAT GAAAAAGAAG AAGACCGTTA TCCAAAATAT GCTTCAGGTT 
1451 TTGGTGTTCG TTCTATTCTT TATTACTTAG AAAATAGTCT TTCTAAGTAG 



The PSORT algorithm predicts an inner membrane location (0.268). 

The protein was expressed in E.coli and purified as a GST-fusion (Figure 50A), as a his-tagged 
protein, and as a GST/His fusion. The proteins were used to immunise mice, whose sera were used in 
Western blot (Figure 50B) and FACS (Figure 50C) analyses. 

The cp6664 protein was also identified in the 2D-PAGE experiment (Cpn0385) and showed good 
cross-reactivity with human sera, including sera from patients with pneumonitis. 

These experiments show that cp6664 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 



Example 51 

The following C.pneumoniae protein (pid 4376696) was expressed <SEQ ID 101; cp6696>: 

1 MTLIFVIIIV WCNAFLIKLC VIMGLQSRLQ HCIEVSQNSN FDSQVKQFIY 

51 ACQDKTLRQS VLKIFRYHPL LKIHDIARAV YLLMALEEGE DLGLSFLNVQ 

101 QYPSGAVELF SCGGFPWKGL PYPAEHAEFG LLLLQIAEFY EESQAYVSKM 

151 SHFQQALFDH QGSVFPSLWS QENSRLLKEK TTLSQSFLFQ LGMQIHPEYS 

201 LEDPALGFWM QRTRSSSAFV AASGCQSSLG AYSSGDVGVI AYGPCSGDIS 

251 DCYYFGCCGI AKEFVCQKSH QTTEISFLTS TGKPHPRNTG FSYLRDSYVH 

301 LPIRCKITIS DKQYRVHAAL AEATSAMTFS IFCKGKNCQV VDGPRLRSCS 

351 LDSYKGPGND IMILGENDAI NIVSASPYME IFALQGKEKF WNADFLINIP 

401 YKEEGVMLIF EKKVTSEKGR FFTKMN* 



A predicted signal peptide is highlighted. 

The cp6696 nucleotide sequence <SEQ ID 102> is: 



1 TTGACTCTAA TTTTTGTTAT TATTATCGTT TGGTGCAATG CTTTTCTGAT 

51 CAAATTGTGC GTGATAATGG GGCTGCAATC CAGGTTACAA CATTGTATAG 

101 AAGTGTCCCA GAATTCGAAC TTTGATTCAC AAGTAAAACA GTTTATCTAT 

151 GCGTGCCAAG ATAAGACATT AAGGCAGTCT GTACTCAAGA TTTTCCGCTA 

201 CCATCCTTTA CTAAAAATTC ATGATATTGC TCGGGCCGTC TATCTTTTGA 

251 TGGCCTTAGA AGAAGGCGAG GATTTAGGCT TAAGCTTTTT AAATGTACAG 

301 CAGTACCCTT CAGGTGCTGT AGAACTGTTT TCTTGTGGGG GATTTCCTTG 

351 GAAAGGATTA CCTTATCCTG CAGAACATGC GGAATTTGGC CTACTCCTGT 

401 TACAGATCGC AGAGTTTTAT GAAGAGAGTC AGGCATACGT CTCTAAAATG 

451 AGTCATTTTC AACAGGCACT CTTTGATCAC CAAGGGAGCG TCTTTCCCTC 

501 TCTCTGGAGC CAGGAGAACT CTCGACTCCT AAAAGAAAAG ACAACTCTTA 

551 GCCAATCGTT TCTCTTCCAA TTAGGAATGC AAATTCACCC AGAATACAGT 

601 CTTGAGGATC CTGCACTAGG GTTCTGGATG CAAAGAACGC GTTCTTCATC 

651 CGCTTTTGTA GCCGCTTCAG GATGTCAAAG TAGCTTGGGA GCGTATTCCT 

701 CAGGGGATGT CGGTGTTATC GCTTATGGAC CTTGCTCTGG AGACATTAGT 

751 GATTGTTATT ATTTTGGATG TTGTGGAATC GCTAAAGAGT TCGTGTGCCA 

801 AAAATCTCAC CAAACTACAG AGATTTCTTT TCTCACCTCT ACAGGAAAGC 

851 CTCATCCCAG AAATACGGGA TTTTCCTACC TTCGAGATTC CTATGTACAT 

901 CTGCCGATCC GCTGTAAGAT CACTATTTCC GACAAGCAAT ATCGCGTGCA 

951 CGCTGCGTTG GCTGAGGCCA CCTCTGCCAT GACGTTTTCT ATTTTCTGTA 

1001 AGGGGAAGAA TTGTCAGGTT GTTGACGGCC CTCGCTTGCG CTCCTGTTCC 

1051 CTAGATTCTT ATAAAGGTCC CGGAAACGAC ATTATGATTC TTGGGGAAAA 

1101 TGACGCAATC AACATTGTTT CTGCAAGTCC CTATATGGAA ATTTTTGCTT 

1151 TGCAAGGCAA AGAAAAATTT TGGAATGCAG ACTTTTTGAT TAATATTCCT 

1201 TACAAAGAAG AGGGCGTCAT GTTAATTTTT GAAAAAAAAG TGACCTCTGA 

1251 GAAAGGAAGA TTCTTTACGA AGATGAATTA A 



The PSORT algorithm predicts an inner membrane location (0.463). 

The protein was expressed in E.coli and purified as a GST-fusion product, as shown in Figure 51A. 
The recombinant protein was used to immunise mice, whose sera were used in a Western blot 
(Figure 51B) and for FACS analysis (Figure 51C). A his-tagged protein was also expressed. 

This protein also showed good cross-reactivity with human sera, including sera from patients with 
pneumonitis. 

These experiments show that cp6696 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 



Example 52 

The following C.pneumoniae protein (pid 4376790) was expressed <SEQ ID 103; cp6790>: 

1 MSEHKKSSKI IGIDLGTTNS CVSVMEGGQA KVITSSBGTR TTPSIVAFKG 

51 NEKLVGIPAK RQAVTNPEKT LGSTKRFIGR KYSEVASEIQ TVPYTVTSGS 

101 KGDAVFEVDG KQYTPEEIGA QILMKMKETA EAYLGETVTE AVITVPAYFN 

151 DSQRASTKDA GRIAGLDVKR IIPEPTAAAL AYGIDKVGDK KIAVFDLGGG 

201 TFDISILEIG DGVFEVLSTN GDTLLGGDDF DEVIIKWMIE EFKKQEGIDL 

251 SKDNMALQRL KDAAEKAKIE LSGVSSTEIN QPFITMDAQG PKHLALTLTR 

301 AQFEKLAASL IERTKSPCIK ALSDAKLSAK DIDDVLLVGG MSRMPAVQET 

351 VKELFGKEPN KGVNPDEVVA IGAAIQGGVL GGEVKDVLLL DVIPLSLGIE 

401 TLGGVMTTLV ERNTTIPTQK KQIFSTAADN QPAVTIVVLQ GERPMAKDNK 

451 EIGRFDLTDI PPAPRGHPQI EVSFDIDANG IFHVSAKDVA SGKEQKIRIE 

501 ASSGLQEDEI QRMVRDAEIN KEEDKKRREA SDAKNEADSM IFRAEKAIKD 

551 YKEQIPETLV KEIEERIENV RNALKDDAPI EKIKEVTEDL SKHMQKIGES 

601 MQSQSASAAA SSAANAKGGP NINTEDLKKH SFSTKPPSNN GSSEDHIEEA 

651 DVEIIDNDDK* 

The cp6790 nucleotide sequence <SEQ ID 104> is: 



1 




51 


A AP A A APTP P 


101 


PA r PPATPPP.A 


151 


AA'PP.AP.A A AT 


£> U X. 


ACS A A A A A aPT 






JUJ. 




TCI 
J Jl 


AAi l\jvaCGL.A 


Art i 


1 AGGCGAAAC 


4j1 


bAl i 1 L. AAC- 


cni 
I?U1 


IXaTAAAACGT 


rci 
D3l 


TCGATAAAGT 


bUJ. 


ACTTTTGATA 


bbl 


ATCTACAAAT 


701 


TTATCAAATG 


751 


AGCAAAGATA 


801 


AAAAATAGAA 


851 


TCACAATGGA 


901 


GCGCAATTCG 


951 


ATGCATCAAA 


1001 


ATGTTCTCTT 


1051 


GTAAAAGAAC 


1101 


AGTTGTTGCT 


1151 


TTAAGGATGT 


1201 


ACTCTAGGAG 


1251 


TACACAGAAA 


1301 


TTACCATCGT 


1351 


GAAATCGGAA 


1401 


TCCTCAAATC 


1451 


TCTCAGCTAA 


1501 


GCAAGCTCAG 


1551 


CGAAATTAAT 


1601 


AAAATGAAGC 


1651 


TATAAGGAGC 


1701 


CGAAAACGTG 


1751 


AAGAGGTTAC 


1801 


ATGCAATCGC 


1851 


AGGTGGACCT 


1901 


CGAAGCCTCC 


1951 


GATGTAGAAA 



ACAAAAAATC 
TGCGTATCTG 
AGGAACAAGA 
TAGTGGGGAT 
CTCGGCTCTA 
GGAAATCCAA 
CCGTTTTCGA 
CAAATCTTAA 
TGTCACAGAA 
GAGCATCCAC 
ATCATTCCAG 
CGGTGATAAA 
TCTCCATCCT 
GGAGATACTC 
GATGATCGAA 
ATATGGCCTT 
CTTTCAGGAG 
TGCACAAGGA 
AGAAACTCGC 
GCACTCAGTG 
AGTTGGAGGT 
TCTTCGGCAA 
ATTGGAGCCG 
TCTACTTCTA 
GCGTCATGAC 
AAACAAATCT 
AGTTCTCCAA 
GATTCGATCT 
GAAGTCTCCT 
AGATGTTGCC 
GACTTCAAGA 
AAGGAAGAAG 
CGATAGCATG 
AAATTCCTGA 
CGCAACGCAC 
TGAAGACCTA 
AGTCTGCATC 
AACATCAATA 
TTCAAATAAC 
TTATTGATAA 



AAGCAAAATT 
TTATGGAAGG 
ACCACGCCAT 
TCCAGCAAAA 
CAAAACGCTT 
ACCGTTCCTT 
AGTTGATGGC 
TGAAAATGAA 
GCAGTGATCA 
AAAAGATGCT 
AACCTACCGC 
AAAATCGCTG 
AGAAATCGGT 
TCCTCGGTGG 
GAATTCAAAA 
ACAAAGACTT 
TCTCTTCCAC 
CCTAAACACC 
AGCCTCTCTA 
ACGCAAAACT 
ATGTCAAGAA 
AGAGCCTAAT 
CAATTCAAGG 
GACGTTATCC 
GACTCTGGTA 
TCTCCACAGC 
GGAGAGCGTC 
TACAGATATC 
TCGATATCGA 
AGCGGTAAAG 
AGATGAAATC 
ATAAAAAACG 
ATCTTCAGAG 
AACTTTAGTT 
TCAAAGATGA 
AGCAAGCATA 
AGCAGCAGCA 
CAGAAGATTT 
GGTTCTTCAG 
CGACGATAAG 



ATAGGTATAG 
AGGACAAGCT 
CGATCGTTGC 
CGTCAAGCAG 
TATTGGCCGT 
ATACAGTCAC 
AAACAATACA 
AG AG AC AG C A 
CCGTCCCCGC 
GGACGCATTG 
AGCAGCTCTT 
TCTTCGACCT 
GATGGCGTCT 
AGACGACTTT 
AACAAGAAGG 
AAAGATGCTG 
AGAAATCAAT 
TTGCATTGAC 
ATCGAAAGAA 
TTCCGCTAAG 
TGCCCGCAGT 
AAAGGAGTCA 
TGGTGTTCTT 
CCCTATCTCT 
GAGAGAAATA 
TGCTGATAAC 
CCATGGCCAA 
CCTCCGGCTC 
TGCAAACGGA 
AACAGAAAAT 
CAAAGAATGG 
TCGTGAAGCT 
CCGAAAAAGC 
AAAGAAATCG 
CGCTCCTATT 
TGCAAAAAAT 
TCATCGGCAG 
GAAAAAACAT 
AAGACCATAT 
TAA 



ACTTAGGCAC 
AAAGTAATTA 
CTTCAAAGGT 
TGACAAATCC 
AAGTACTCTG 
CTCCGGATCT 
CTCCAGAAGA 
GAAGCTTATC 
ATACTTCAAT 
CAGGTCTAGA 
GCCTACGGAA 
TGGTGGAGGA 
TCGAAGTTCT 
GATGAAGTCA 
CATTGATCTT 
CTGAGAAAGC 
CAGCCATTCA 
ACTCACACGT 
CAAAATCTCC 
GATATCGATG 
GCAAGAAACT 
ACCCCGACGA 
GGCGGAGAAG 
GGGTATCGAA 
CTACAATCCC 
CAGCCTGCGG 
AGATAACAAG 
CTCGAGGCCA 
ATTTTCCATG 
TCGTATCGAA 
TTCGAGATGC 
TCAGATGCTA 
TATTAAAGAT 
AAGAGCGAAT 
GAAAAAATTA 
TGGAGAGTCT 
CCAATGCTAA 
AGTTTCAGTA 
CGAAGAAGCT 



The PSORT algorithm predicts an inner membrane location (0.151). 

The protein was expressed in E.coli and purified as a GST-fusion product (Figure 52A) and a his-tagged 
product. The proteins were used to immunise mice, whose sera were used in Western blot 
(Figure 52B) and FACS (Figure 52C) analyses. 

The cp6790 protein was also identified in the 2D-PAGE experiment (Cpn0503). 

These experiments show that cp6790 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 



Example 53 

The following C.pneumoniae protein (pid 4376878) was expressed <SEQ ID 105; cp6878>: 

1 MNVPDSKNLH PPAYELLEIK ARITQSYKEA SAILTAIPDG ILLLSETGHP 

51 LICNSQAREI LGIDENLEIL NRSFTDVLPD TCLGFSIQEA LESLKVPKTL 

101 RLSLCKESKE KEVELFIRKN EISGYLFIQI RDRSDYKQLE NAIERYKNIA 

151 ELGKMTATLA HEIRNPLSGI VGFASILKKE ISSPRHQRML SSIISGTRSL 

201 NNLVSSMLEY TKSQPLNLKI INLQDFFSSL IPLLSVSFPN CKFVREGAQP 

251 LFRSIDPDRM NSVVWNLVKN AVETGNSPIT LTLHTSGDIS VTNPGTIPSE 

301 IMDKLFTPFF TTKREGNGLG LAEAQKIIRL HGGDIQLKTS DSAVSFFIII 

351 PELLAALPKE RAAS* 

The cp6878 nucleotide sequence <SEQ ID 106> is: 

1 ATGAACGTCC CTGATTCCAA GAACCTCCAT CCTCCTGCAT ACGAACTCCT 

51 AGAGATCAAG GCTCGCATCA CACAATCTTA TAAAGAAGCG AGTGCTATAC 

101 TGACAGCGAT TCCTGATGGT ATCCTATTAC TTTCTGAAAC AGGACACTTT 

151 CTTATCTGCA ATTCACAAGC ACGTGAAATT CTAGGAATTG ATGAAAATCT 

201 AGAAATTCTT AATAGATCCT TTACCGATGT TCTCCCCGAT ACGTGTCTTG 

251 GATTTTCTAT TCAAGAGGCT CTTGAATCTC TAAAAGTCCC TAAAACTCTT 

301 AGACTCTCTC TCTGTAAAGA ATCTAAAGAA AAAGAAGTGG AACTCTTCAT 

351 CCGTAAAAAC GAGATCAGTG GATACCTGTT TATCCAAATC CGCGATCGGT 

401 CCGACTATAA ACAACTAGAA AACGCTATAG AAAGATATAA AAATATCGCA 

451 GAACTTGGGA AAATGACGGC TACCCTAGCT CACGAAATCC GCAATCCGCT 

501 AAGTGGAATC GTTGGATTTG CCTCTATCCT AAAGAAAGAG ATTTCCTCTC 

551 CTCGCCACCA ACGAATGCTC TCCTCAATCA TCTCCGGCAC AAGGTCTCTA 

601 AATAACCTTG TCTCTTCTAT GTTAGAATAT ACAAAATCAC AACCGTTGAA 

651 CCTAAAGATT ATAAATTTAC AAGACTTCTT CTCTTCTCTT ATCCCTCTGC 

701 TCTCCGTCTC TTTCCCGAAT TGCAAGTTTG TAAGAGAGGG CGCACAACCT 

751 CTATTCAGAT CTATAGATCC TGATCGGATG AACAGTGTCG TTTGGAACCT 

801 AGTGAAAAAT GCTGTAGAAA CAGGGAACTC TCCGATCACT CTGACCCTGC 

851 ATACATCGGG AGACATCTCG GTAACGAACC CCGGAACGAT TCCTTCCGAG 

901 ATCATGGACA AGCTCTTCAC TCCATTCTTC ACAACAAAGA GAGAGGGAAA 

951 TGGTTTGGGA CTTGCTGAAG CTCAAAAAAT TATAAGACTC CATGGAGGAG 

1001 ATATCCAATT AAAAACAAGC GACTCCGCCG TTAGCTTCTT CATAATCATC 

1051 CCCGAACTTC TAGCGGCCCT ACCCAAAGAA AGAGCCGCTA G 

The PSORT algorithm predicts an inner membrane location (0.204). 

The protein was expressed in E.coli and purified as a his-tag product (Figure 53A) and as a 
GST-fusion product. The recombinant GST-fusion protein was used to immunise mice, whose sera were 
used in a Western blot (Figure 53B) and for FACS analysis. 

These experiments show that cp6878 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 

Example 54 

The following C. pneumoniae protein (pid 4377224) was expressed <SEQ ID 107; cp7224>: 

1 MMKKIRKVAL AVGGSGGHIV PALSVKEAFS REGIDVLLLG KGLKNHPSLQ 

51 QGISYREIPS GLPTVLNPIK IMSRTLSLCS GYLKARKELK IFDPDLVIGF 

101 GSYHSLPVLL AGLSHKIPLF LHEQNLVPGK VNQLFSRYAR GIGVNFSPVT 

151 KHFRCPAEEV FLPKRSFSLG SPMMKRCTNH TPTICVVGGS QGAQILNTCV 

201 PQALVKLVNK YPNMYVHHIV GPKSDVMKVQ HVYNRGEVLC CVKPFEEQLL 

251 DVLLAADLVI SRAGATILEE ILWAKVPGIL IPYPGAYGHQ EVNAKFFVDV 

301 LEGGTMILEK ELTEKLLVEK VTFALDSHNR EKQKNSLAAY SQQRSTKTFH 

351 AFICECL* 

The cp7224 nucleotide sequence <SEQ ID 108> is: 

1 ATGATGAAGA AAATTCGAAA AGTAGCCTTG GCTGTAGGAG GTTCAGGAGG 

51 CCACATTGTC CCAGCTCTCT CGGTAAAGGA AGCTTTTTCT CGTGAAGGAA 

101 TAGACGTATT ACTACTAGGG AAAGGTCTCA AGAACCATCC TTCTTTGCAA 

151 CAGGGAATCA GCTATCGGGA AATCCCCTCA GGACTTCCTA CAGTCCTTAA 

201 TCCCATAAAG ATCATGAGCA GGACCCTTTC TCTATGTTCA GGATACCTGA 

251 AAGCAAGAAA GGAACTTAAA ATTTTTGACC CTGACCTGGT CATAGGATTT 

301 GGGAGCTACC ACTCTCTTCC CGTGTTGCTC GCAGGACTGT CCCATAAAAT 

351 TCCCTTATTT CTACACGAAC AAAATCTAGT TCCTGGAAAA GTAAATCAAT 

401 TGTTTTCCCG CTATGCTCGA GGTATTGGAG TGAATTTCTC CCCCGTTACT 

451 AAACACTTCC GCTGCCCCGC AGAAGAGGTC TTCCTTCCTA AACGAAGCTT 

501 CTCCTTAGGA AGCCCTATGA TGAAGCGATG TACAAATCAT ACCCCTACAA 

551 TCTGTGTTGT TGGAGGTTCT CAGGGAGCAC AGATATTAAA TACTTGTGTT 

601 CCCCAAGCTC TTGTCAAGCT AGTCAATAAG TACCCAAATA TGTACGTCCA 

651 TCATATTGTA GGACCTAAAA GTGATGTTAT GAAGGTGCAA CATGTTTACA 

701 ATCGTGGAGA GGTCCTCTGC TGTGTGAAGC CGTTCGAAGA GCAACTCCTA 

751 GATGTCTTGC TTGCCGCAGA TTTGGTCATC AGTAGGGCAG GAGCCACAAT 

801 TTTAGAAGAA ATTCTTTGGG CAAAAGTTCC CGGAATTTTA ATTCCCTATC 

851 CAGGAGCTTA TGGACATCAG GAAGTTAATG CTAAATTCTT TGTAGACGTC 

901 TTAGAAGGGG GAACTATGAT CCTAGAAAAA GAATTAACAG AGAAGCTATT 

951 AGTAGAAAAA GTAACGTTTG CTTTAGACTC CCATAACAGA GAAAAACAAC 

1001 GCAATTCCCT AGCGGCGTAT AGTCAGCAAA GGTCAACAAA AACATTCCAT 

1051 GCATTCATTT GTGAATGCTT ATAG 



The PSORT algorithm predicts an inner membrane location (0.164). 

The protein was expressed in E.coli and purified as a GST-fusion product, as shown in Figure 54A. 
The recombinant protein was used to immunise mice, whose sera were used in a Western blot 
(Figure 54B) and for FACS analysis (Figure 54C). A his-tagged protein was also expressed. 

This protein also showed good cross-reactivity with human sera, including sera from patients with 
pneumonitis. 

These experiments show that cp7224 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 

Example 55 

The following C.pneumoniae protein (PID 4377140) was expressed <SEQ ID 109; cp7140>: 



1 MVRRSISFCL FFLMTLLCCT SCNSRSLIVH GLPGREANEI VVLLVSKGVA 

51 AQKLPQAAAA TAGAATEQMW DIAVPSAQIT EALAILNQAG LPRMKGTSLL 

101 DLFAKQGLVP SELQEKIRYQ EGLSEQMAST IRKMDGVVDA SVQISFTTEN 

151 EDNLPLTASV YIKHRGVLDN PNSIMVSKIK RLIASAVPGL VPENVSVVSD 

201 RAAYSDITIN GPWGLTEEID YVSVWGIILA KSSLTKFRLI FYVLILILFV 

251 ISCGLLWVIW KTHTLIMTMG GTKGFFNPTP YTKNALEAKK AEGAAADKEK 

301 KEDADSQGES KNAETSDKDS SDKDAPEGSN EIEGA* 

A predicted signal peptide is highlighted. 

The cp7140 nucleotide sequence <SEQ ID 110> is: 

1 ATGGTTCGTC GATCTATTTC TTTTTGCTTG TTCTTTCTAA TGACATTGCT 

51 GTGCTGTACA AGCTGTAACA GCAGGTCTCT AATTGTGCAC GGTCTTCCTG 

101 GCAGAGAAGC GAATGAGATT GTGGTGCTTT TGGTAAGCAA AGGGGTGGCT 

151 GCACAAAAAT TGCCTCAAGC TGCAGCGGCT ACAGCCGGAG CAGCTACTGA 

201 GCAAATGTGG GATATCGCGG TTCCGTCAGC ACAAATCACA GAGGCCCTTG 

251 CCATTCTAAA TCAAGCGGGT CTTCCACGTA TGAAAGGGAC AAGCCTGTTA 

301 GATCTTTTTG CAAAACAAGG TCTTGTTCCT TCCGAGCTTC AGGAAAAAAT 

351 CCGTTATCAA GAAGGCTTAT CAGAACAGAT GGCCTCTACG ATTAGAAAAA 

401 TGGATGGCGT TGTCGATGCC TCAGTACAGA TTTCCTTCAC TACAGAAAAT 

451 GAAGATAATC TTCCTTTAAC AGCCTCTGTG TATATTAAGC ATCGAGGGGT 

501 TTTGGACAAT CCGAACAGCA TTATGGTTTC CAAAATTAAG CGCCTTATTG 

551 CAAGTGCTGT TCCAGGACTT GTGCCAGAGA ACGTCTCTGT AGTGAGCGAT 

601 CGCGCAGCTT ATAGTGATAT TACAATTAAT GGTCCTTGGG GATTAACAGA 

651 AGAAATCGAT TATGTTTCTG TTTGGGGTAT TATTCTTGCG AAGTCTTCGC 

701 TCACCAAATT CCGTCTCATT TTTTATGTCT TGATTCTCAT TTTATTTGTT 

751 ATTTCTTGTG GTCTCCTTTG GGTCATTTGG AAAACTCATA CTCTCATTAT 

801 GACTATGGGA GGTACAAAAG GGTTCTTCAA CCCTACACCA TATACAAAGA 

851 ATGCCTTGGA AGCCAAGAAA GCCGAGGGAG CAGCTGCTGA CAAAGAGAAA 

901 AAAGAAGATG CAGATTCACA GGGGGAAAGC AAAAATGCGG AAACCAGTGA 

951 TAAAGACTCT AGTGATAAAG ATGCTCCAGA AGGAAGCAAT GAAATTGAGG 

1001 GTGCTTAG 

The PSORT algorithm predicts an inner membrane location (0.650). 

The protein was expressed in E.coli and purified as a GST-fusion product, as shown in Figure 55A. 
The recombinant protein was used to immunise mice, whose sera were used in a Western blot 
(Figure 55B) and for FACS analysis (Figure 55C). A his-tagged protein was also expressed. 

These experiments show that cp7140 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 

Example 56 

The following C.pneumoniae protein (pid 4377306) was expressed <SEQ ID 111; cp7306>: 

1 MITKQLRSWL AVLVGSSLLA LPLSGQAVGK KESRVSELPQ DVLLKEISGG 

51 FSKVATKATP AVVYIESFPK SQAVTHPSPG RRGPYENPFD YFNDEFFNRF 

101 FGLPSQREKP QSKEAVRGTG FLVSPDGYIV TNNHVVEDTG KIHVTLHDGQ 

151 KYPATVIGLD PKTDLAVIKI KSQNLPYLSF GNSDHLKVGD WAIAIGNPFG 

201 LQATVTVGVI SAKGRNQLHI ADFEDFIQTD AAINPGNSGG PLLNIDGQVI 

251 GVNTAIVSGS GGYIGIGFAI PSLMANRIID QLIRDGQVTR GFLGVTLQPI 

301 DAKLAACYKD EKVYGALVTD VVKGSPADKA GLKQEDVIIA YNGKEVDSLS 

351 MFRNAVSLMN PDTRIVLKVV REGKVIEIPV TVSQAPKEDG MSALQRVGIR 

401 VQNLTPETAK KLGIAPETKG ILIISVEPGS VAASSGIAPG QLILAVNRQK 

451 VSSIEDLNRT LKDSNNENIL LMVSQGDVIR FIALKPEE* 

A predicted signal peptide is highlighted. 

The cp7306 nucleotide sequence <SEQ ID 112> is: 

1 ATGATAACTA AGCAATTGCG TTCGTGGCTA GCTGTACTTG TTGGTTCAAG 

51 TCTGCTAGCT CTTCCTTTAT CAGGGCAAGC TGTCGGGAAA AAAGAATCTC 

101 GAGTTTCCGA GCTGCCTCAA GACGTTCTTC TTAAAGAGAT CTCGGGAGGG 

151 TTTTCTAAGG TCGCTACCAA GGCGACTCCC GCTGTTGTGT ACATAGAAAG 

201 TTTCCCAAAG AGCCAGGCTG TAACACATCC TTCTCCTGGA CGCCGTGGGC 

251 CTTATGAAAA TCCTTTTGAT TATTTTAATG ATGAGTTTTT CAATCGTTTT 

301 TTTGGTCTAC CTTCACAGAG GGAAAAACCT CAAAGTAAAG AGGCGGTTCG 

351 AGGAACAGGT TTCCTAGTAT CTCCAGATGG CTATATTGTG ACTAATAACC 

401 ATGTTGTCGA AGATACAGGT AAGATTCACG TAACTCTTCA TGATGGGCAA 

451 AAGTACCCAG CAACTGTAAT CGGACTCGAT CCTAAAACAG ACCTTGCAGT 

501 CATTAAAATT AAATCCCAAA ACCTCCCGTA TCTTTCTTTT GGAAACTCCG 

551 ACCACTTAAA AGTCGGAGAT TGGGCAATTG CAATTGGAAA TCCCTTCGGT 

601 CTTCAAGCTA CGGTCACCGT AGGTGTCATC AGTGCTAAAG GAAGAAATCA 

651 ACTCCACATT GCAGATTTTG AAGATTTTAT TCAGACAGAT GCTGCGATTA 

701 ATCCAGGCAA CTCTGGAGGC CCTCTTCTAA ATATTGATGG ACAGGTCATC 

751 GGTGTTAATA CTGCCATTGT CAGTGGTAGT GGTGGCTATA TTGGAATCGG 

801 GTTTGCGATT CCTAGCCTTA TGGCAAATAG AATCATAGAT CAGCTGATTC 

851 GTGATGGTCA AGTTACCCGA GGATTCTTAG GAGTGACTTT ACAACCTATA 

901 GATGCGGAAC TCGCTGCTTG CTACAAACTC GAAAAGGTTT ATGGCGCTTT 

951 AGTCACAGAT GTTGTTAAAG GATCTCCAGC AGATAAAGCA GGGCTAAAAC 

1001 AAGAAGATGT GATCATTGCT TATAATGGGA AAGAAGTCGA TTCACTGAGT 

1051 ATGTTCCGTA ATGCTGTTTC TTTAATGAAT CCAGATACAC GTATTGTTCT 

1101 AAAGGTAGTT CGTGAAGGAA AGGTTATCGA AATACCCGTG ACAGTTTCTC 

1151 AAGCTCCAAA AGAAGATGGA ATGTCGGCTT TACAGCGTGT GGGAATCCGT 

1201 GTGCAAAACC TAACTCCTGA AACTGCTAAG AAGCTGGGAA TTGCTCCAGA 

1251 GACTAAAGGC ATTTTGATTA TAAGTGTTGA ACCAGGGTCT GTAGCAGCTT 

1301 CTTCAGGAAT TGCTCCTGGT CAGCTGATCC TTGCTGTGAA TAGACAAAAA 

1351 GTATCTTCGA TTGAAGATCT GAATAGAACG TTAAAAGATT CTAACAATGA 

1401 GAATATTCTT CTTATGGTTT CTCAAGGAGA TGTTATTCGC TTCATTGCCC 

1451 TGAAACCTGA AGAATAA 

The PSORT algorithm predicts a periplasmic location (0.923). 

The protein was expressed in E.coli and purified as a his-tag product (Figure 56A) and as a 
GST-fusion product (Figure 56B). The recombinant proteins were used to immunise mice, whose sera 
were used in Western blot (Figure 56C) and FACS (Figure 56D) analyses. 

The cp7306 protein was also identified in the 2D-PAGE experiment (Cpn0979) and showed good 
cross-reactivity with human sera, including sera from patients with pneumonitis. 

These experiments show that cp7306 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 

Example 57 

The following C.pneumoniae protein (pid 4377132) was expressed <SEQ ID 113; cp7132>: 

1 MCNSIAMKKQ KRGFVLMELL MSFTLIALLL GTLGFWYRKI YTVQKQKERI 

51 YNFYIEESRA YKQLRTLFSM SLSSSYEEPG SLFSLIFDRG VYRDPKLAGA 

101 VRASLHHDTK DQRLELRICN IKDQSYFETQ RLLSHVTHVV LSFQRNPDPE 

151 KLPETIALTI TREPKAYPPR TLTYQFAVGK* 

A predicted signal peptide is highlighted. 

The cp7132 nucleotide sequence <SEQ ID 114> is: 

1 ATGTGTAACT CTATAGCTAT GAAAAAGCAA AAGCGTGGCT TTGTGCTTAT 

51 GGAATTACTC ATGTCGTTCA CTCTAATTGC TTTGTTATTA GGGACTTTAG 

101 GATTTTGGTA TCGGAAAATT TATACTGTAC AAAAGCAAAA AGAACGTATT 

151 TATAACTTTT ATATCGAAGA AAGCCGAGCC TACAAGCAGC TCAGAACCCT 

201 GTTTAGCATG TCCTTGTCTT CATCTTACGA GGAGCCTGGA TCATTATTTT 

251 CTTTAATCTT TGATCGGGGT GTTTATCGAG ATCCTAAGCT GGCAGGTGCG 

301 GTACGAGCTT CTCTCCATCA TGACACCAAG GATCAGAGAT TGGAACTTCG 

351 TATTTGTAAT ATTAAGGATC AGTCTTACTT TGAAACACAG CGACTGCTCT 

401 CCCACGTGAC CCATGTTGTA CTTTCCTTCC AGAGAAATCC TGATCCTGAA 

451 AAACTTCCTG AAACAATTGC TTTAACTATA ACACGGGAAC CTAAAGCATA 

501 TCCTCCAAGG ACGTTAACAT ACCAATTTGC GGTTGGGAAA TAA 

The PSORT algorithm predicts a periplasmic location (0.915). 

The protein was expressed in E.coli and purified as a his-tag product (Figure 57A) or as a 
GST-fusion. The recombinant proteins were used to immunise mice, whose sera were used in 
Western blot (Figure 57B) and FACS (Figure 57C) analyses. 

These experiments show that cp7132 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 
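
The correspondence between the cp7132 amino acid sequence <SEQ ID 113> and nucleotide sequence <SEQ ID 114> can be checked by translating the printed nucleotide rows. The following is a minimal sketch in Python, assuming the standard genetic code; the names `translate`, `CODON_TABLE` and `first_row` are illustrative only and are not taken from the specification.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order
# (assumption: the standard code applies to these sequences).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(listing: str) -> str:
    """Translate a printed nucleotide listing into one-letter amino acid codes.

    Position numbers and spaces are ignored; a trailing incomplete codon
    is dropped. Stop codons are returned as '*'.
    """
    dna = "".join(ch for ch in listing.upper() if ch in BASES)
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# First printed row of the cp7132 nucleotide sequence <SEQ ID 114>.
first_row = "1 ATGTGTAACT CTATAGCTAT GAAAAAGCAA AAGCGTGGCT TTGTGCTTAT"
print(translate(first_row))  # MCNSIAMKKQKRGFVL
```

The output matches the first sixteen residues of the cp7132 protein as listed above. The same check can be applied row by row to the other sequence pairs, provided each row is translated in the reading frame set by its position number.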

Example 58 

The following C.pneumoniae protein (pid 4376733) was expressed <SEQ ID 115; cp6733>: 

1 MKTSIPWVLV SSVTAFSCHL QSLANEELLS PDDSFNGNID SGTFTPKTSA 

51 TTYSLTGDVF FYEPGKGTPL SDSCFKQTTD NLTFLGNGHS LTFGFIDAGT 

101 HAGAAASTTA NKNLTFSGFS LLSFDSSPST TVTTGQGTLS SAGGVNLENI 

151 RKLVVAGNFS TADGGAIKGA SFLLTGTSGD ALFSNNSSST KGGAIATTAG 

201 ARIANNTGYV RFLSNIASTS GGAIDDEGTS ILSNNKFLYF EGNAAKTTGG 

251 AICNTKASGS PELIISNNKT LIFASNVAET SGGAIHAKKL ALSSGGFTEF 

301 LRNNVSSATP KGGAISIDAS GELSLSAETG NITFVRNTLT TTGSTDTPKR 

351 NAINIGSNGK FTELRAAKNH TIFFYDPITS EGTSSDVLKI NNGSAGALNP 

401 YQGTILFSGE TLTADELKVA DNLKSSFTQP VSLSGGKLLL QKGVTLESTS 

451 FSQEAGSLLG MDSGTTLSTT AGSITITNLG INVDSLGLKQ PVSLTAKGAS 

501 NKVIVSGKLN LIDIEGNIYE SHMPSHDQLF SLLKITVDAD VDTNVDISSL 

551 IPVPAEDPNS EYGFQGQWNV NWTTDTATNT KEATATWTKT GFVPSPERKS 

601 ALVCNTLWGV FTDIRSLQQL VEIGATGMEH KQGFWVSSMT NFLHKTGDEN 

651 RKGFRHTSGG YVIGGSAHTP KDDLFTFAFC HLFARDKDCF IAHNNSRTYG 

701 GTLFFKHSHT LQPQNYLRLG RAKFSESAIE KFPREIPLAL DVQVSFSHSD 

751 NRMETHYTSL PESEGSWSNE CIAGGIGLDL PFVLSNPHPL FKTFIPQMKV 

801 EMVYVSQNSF FESSSDGRGF SIGRLLNLSI PVGAKFVQGD IGDSYTYDLS 

851 GFFVSDVYRN NPQSTATLVM SPDSWKIRGG NLSRQAFLLR GSNNYVYNSN 

901 CELFGHYAME LRGSSRNYNV DVGTKLRF* 

A predicted signal peptide is highlighted. 

The cp6733 nucleotide sequence <SEQ ID 116> is: 

1 ATGAAGACTT CGATTCCTTG GGTTTTAGTT TCCTCCGTGT TAGCTTTCTC 

51 ATGTCACCTA CAGTCACTAG CTAACGAGGA ACTTTTATCA CCTGATGATA 

101 GCTTTAATGG AAATATCGAT TCAGGAACGT TTACTCCAAA AACTTCAGCC 

151 ACAACATATT CTCTAACAGG AGATGTCTTC TTTTACGAGC CTGGAAAAGG 

201 CACTCCCTTA TCTGACAGTT GTTTTAAGCA AACCACGGAC AATCTTACCT 

251 TCTTGGGGAA CGGTCATAGC TTAACGTTTG GCTTTATAGA TGCTGGCACT 

301 CATGCAGGTG CTGCTGCATC TACAACAGCA AATAAGAATC TTACCTTCTC 

351 AGGGTTTTCC TTACTGAGTT TTGATTCCTC TCCTAGCACA ACGGTTACTA 

401 CAGGTCAGGG AACGCTTTCC TCAGCAGGAG GCGTAAATTT AGAAAATATT 

451 CGTAAACTTG TAGTTGCTGG GAATTTTTCT ACTGCAGATG GTGGAGCTAT 

501 CAAAGGAGCG TCTTTCCTTT TAACTGGCAC TTCTGGAGAT GCTCTTTTTA 

551 GTAACAACTC TTCATCAACA AAGGGAGGAG CAATTGCTAC TACAGCAGGC 

601 GCTCGCATAG CAAATAACAC AGGTTATGTT AGATTCCTAT CTAACATAGC 

651 GTCTACGTCA GGAGGCGCTA TCGATGATGA AGGCACGTCG ATACTATCGA 

701 ACAACAAATT TCTATATTTT GAAGGGAATG CAGCGAAAAC TACTGGCGGT 

751 GCGATCTGCA ACACCAAGGC GAGTGGATCT CCTGAACTGA TAATCTCTAA 

801 CAATAAGACT CTGATCTTTG CTTCAAACGT AGCAGAAACA AGCGGTGGCG 

851 CCATCCATGC TAAAAAGCTA GCCCTTTCCT CTGGAGGCTT TACAGAGTTT 

901 CTACGAAATA ATGTCTCATC AGCAACTCCT AAGGGGGGTG CTATCAGCAT 

951 CGATGCCTCA GGAGAGCTCA GTCTTTCTGC AGAGACAGGA AACATTACCT 

1001 TTGTAAGAAA TACCCTTACA ACAACCGGAA GTACCGATAC TCCTAAACGT 

1051 AATGCGATCA ACATAGGAAG TAACGGGAAA TTCACGGAAT TACGGGCTGC 

1101 TAAAAATCAT ACAATTTTCT TCTATGATCC CATCACTTCA GAAGGAACCT 

1151 CATCAGACGT ATTGAAGATA AATAACGGCT CTGCGGGAGC TCTCAATCCA 

1201 TATCAAGGAA CGATTCTATT TTCTGGAGAA ACCCTAACAG CAGATGAACT 

1251 TAAAGTTGCT GACAATTTAA AATCTTCATT CACGCAGCCA GTCTCCCTAT 

1301 CCGGAGGAAA GTTATTGCTA CAAAAGGGAG TCACTTTAGA GAGCACGAGC 

1351 TTCTCTCAAG AGGCCGGTTC TCTCCTCGGC ATGGATTCAG GAACGACATT 

1401 ATCAACTACA GCTGGGAGTA TTACAATCAC GAACCTAGGA ATCAATGTTG 

1451 ACTCCTTAGG TCTTAAGCAG CCCGTCAGCC TAACAGCAAA AGGTGCTTCA 

1501 AATAAAGTGA TCGTATCTGG GAAGCTCAAC CTGATTGATA TTGAAGGGAA 

1551 CATTTATGAA AGTCATATGT TCAGCCATGA CCAGCTCTTC TCTCTATTAA 

1601 AAATCACGGT TGATGCTGAT GTTGATACTA ACGTTGACAT CAGCAGCCTT 

1651 ATCCCTGTTC CTGCTGAGGA TCCTAATTCA GAATACGGAT TCCAAGGACA 

1701 ATGGAATGTT AATTGGACTA CGGATACAGC TACAAATACA AAAGAGGCCA 

1751 CGGCAACTTG GACCAAAACA GGATTTGTTC CCAGCCCCGA AAGAAAATCT 

1801 GCGTTAGTAT GCAATACCCT ATGGGGAGTC TTTACTGACA TTCGCTCTCT 

1851 GCAACAGCTT GTAGAGATCG GCGCAACTGG TATGGAACAC AAACAAGGTT 

1901 TCTGGGTTTC CTCCATGACG AACTTCCTGC ATAAGACTGG AGATGAAAAT 

1951 CGCAAAGGCT TCCGTCATAC CTCTGGAGGC TACGTCATCG GTGGAAGTGC 

2001 TCACACTCCT AAAGACGACC TATTTACCTT TGCGTTCTGC CATCTCTTTG 

2051 CTAGAGACAA AGATTGTTTT ATCGCTCACA ACAACTCTAG AACCTACGGT 

2101 GGAACTTTAT TCTTCAAGCA CTCTCATACC CTACAACCCC AAAACTATTT 

2151 GAGATTAGGA AGAGCAAAGT TTTCTGAATC AGCTATAGAA AAATTCCCTA 

2201 GGGAAATTCC CCTAGCCTTG GATGTCCAAG TTTCGTTCAG CCATTCAGAC 

2251 AACCGTATGG AAACGCACTA TACCTCATTG CCAGAATCCG AAGGTTCTTG 

2301 GAGCAACGAG TGTATAGCTG GTGGTATCGG CCTAGACCTT CCTTTTGTTC 

2351 TTTCCAACCC ACATCCTCTT TTCAAGACCT TCATTCCACA GATGAAAGTC 

2401 GAAATGGTTT ATGTATCACA AAATAGCTTC TTCGAAAGCT CTAGTGATGG 

2451 CCGTGGTTTT AGTATTGGAA GGCTGCTTAA CCTCTCGATT CCTGTGGGTG 

2501 CGAAATTCGT GCAGGGGGAT ATCGGAGATT CCTACACCTA TGATCTCTCA 

2551 GGATTCTTTG TTTCCGATGT CTATCGTAAC AATCCCCAAT CTACAGCGAC 

2601 TCTTGTGATG AGCCCAGACT CTTGGAAAAT TCGCGGTGGC AATCTTTCAA 

2651 GACAGGCATT TTTACTGAGG GGTAGCAACA ACTACGTCTA CAACTCCAAT 

2701 TGTGAGCTCT TCGGACATTA CGCTATGGAA CTCCGTGGAT CTTCAAGGAA 

2751 CTACAATGTA GATGTTGGTA CCAAACTCCG ATTCTAG 

The PSORT algorithm predicts an outer membrane location (0.924). 

The protein was expressed in E.coli and purified as a his-tag product, as shown in Figure 58A. The 
recombinant protein was used to immunise mice, whose sera were used in a Western blot (Figure 
58B) and for FACS (Figure 58C) analyses. A GST-fusion protein was also expressed. 

The cp6733 protein was also identified in the 2D-PAGE experiment (Cpn0451). 

These experiments show that cp6733 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 

Example 59 

The following C.pneumoniae protein (pid 4376814) was expressed <SEQ ID 117; cp6814>: 

1 MHDALLSILA IQELDIKMIR LMRVKKEHQK ELAKVQSLKS DIRRKVQEKE 

51 LEMENLKTQI RDGENRIQEI SEQINKLENQ QAAVKKMDEF NALTQEMTTA 

101 NKERRSLEHQ LSDLMDKQAG GEDLIVSLKE SLASTENSSS VIEKEIFESI 

151 KKINEEGKAL LEQRTELKHA TNPELLSIYE RLLNNKKDRV VVPIENRVCS 

201 GCHIVLTPQH ENLVRKKDRL IFCEHCSRIL YWQESQVNAQ ENSTAKRRRR 

251 RAAV* 

The cp6814 nucleotide sequence <SEQ ID 118> is: 

1 ATGCATGACG CACTTCTAAG CATTTTGGCT ATTCAAGAGC TTGATATTAA 

51 AATGATTCGC CTTATGCGCG TAAAGAAAGA ACATCAGAAA GAATTGGCTA 

101 AAGTCCAATC TTTAAAAAGT GATATTCGTA GAAAAGTTCA GGAAAAAGAA 

151 CTCGAAATGG AGAATTTGAA AACTCAAATT CGAGATGGAG AGAATCGCAT 

201 CCAAGAGATT TCTGAACAAA TCAATAAATT AGAAAATCAG CAAGCTGCTG 

251 TAAAAAAAAT GGATGAGTTT AACGCTCTTA CCCAAGAAAT GACTACAGCA 

301 AACAAAGAAC GTCGCTCTTT AGAGCACCAG CTTAGCGATC TCATGGATAA 

351 GCAAGCTGGA GGCGAAGACC TTATTGTCTC TCTAAAAGAA AGCTTAGCTT 

401 CTACAGAAAA TAGTAGCAGT GTCATTGAAA AAGAAATTTT TGAAAGCATC 

451 AAAAAGATTA ATGAAGAAGG CAAAGCTTTG CTTGAACAAC GGACAGAGTT 

501 AAAGCATGCG ACGAATCCCG AACTACTCAG CATCTATGAG CGTCTATTAA 

551 ACAATAAAAA AGATCGCGTT GTTGTTCCTA TTGAAAATCG TGTCTGCAGT 

601 GGTTGTCATA TTGTTCTAAC TCCTCAACAC GAAAATCTTG TAAGAAAGAA 

651 AGACCGACTC ATTTTTTGCG AACATTGCTC TCGAATTCTC TATTGGCAAG 

701 AATCCCAAGT CAATGCTCAG GAAAATTCCA CAGCAAAACG TCGTCGTCGT 

751 CGCGCAGCTG TATAA 

The PSORT algorithm predicts an inner membrane location (0.070). 

The protein was expressed in E.coli and purified as a GST-fusion (Figure 59A) or his-tagged 
product. The recombinant proteins were used to immunise mice, whose sera were used in Western 
blot (Figure 59B) and FACS (Figure 59C) analyses. 

These experiments show that cp6814 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 

Example 60 

The following C.pneumoniae protein (pid 4376830) was expressed <SEQ ID 119; cp6830>: 

1 MKWLPATAVF AAVLPALTAF GDPASVEIST SHTGSGDPTS DAALTGFTQS 

51 STETDGTTYT IVGDITPSTF TNIPVFVVTP DANDSSSNSS KGGSSSSGAT 

101 SLIRSSNLHS DFDFTKDSVL DLYHLFFPSA SNTLNPALLS SSSSGGSSSS 

151 SSSSSSGSAS AVVAADPKGG AAFYSNEANG TLTFTTDSGN PGSLTLQNLK 

201 MTGDGAAIYS KGPLVFTGLK NLTFTGNESQ KSGGAAYTEG ALTTQAIVEA 

251 VTFTGNTSAG QGGAIYVKEA TLFNALDSLK FEKNTSGQAG GGIYTESTLT 

301 ISNITKSIEF ISNKASVPAP APEPTSPAPS SLINSTTIDT STLQTRAASA 

351 TPAVAPVAAV TPTPISTQET AGNGGAIYAK QGISISTFKD LTFKSNSASV 

401 DATLTVDSST IGESGGAIFA ADSIQIQQCT GTTLFSGNTA NKSGGGIYAV 

451 GQVTLEDIAN LKMTNNTCKG EGGAIYTKKA LTINNGAILT TFSGNTSTDN 

501 GGAIFAVGGI TLSDLVEVRF SKNKTGNYSA PITKAASNTA PVVSSSTTAA 

551 SPAVPAAAAA PVTNAAKGGA LYSTEGLTVS GITSILSFEN NECQNQGGGA 

601 YVTKTFQCSD SHRLQFTSNK AADEGGGLYC GDDVTLTNLT GKTLFQENSS 

651 EKHGGGLSLA SGKSLTMTSL ESFCLNANTA KENGGGANVP ENIVLTFTYT 

701 PTPNEPAPVQ QPVYGEALVT GNTATKSGGG IYTKNAAFSN LSSVTFDQNT 

751 SSENGGALLT QKAADKTDCS FTYITNVNIT NNTATGNGGG IAGGKAHFDR 

801 IDNLTVQSNQ AKKGGGVYLE DALILEKVIT GSVSQNTATE SGGGIYAKDI 

851 QLQALPGSFT ITDNKVETSL TTSTNLYGGG IYSSGAVTLT NISGTFGITG 

901 NSVINTATSQ DADIQGGGIY ATTSLSINQC NTPILFSNNS AATKKTSTTK 

951 QIAGGAIFSA AVTIENNSQP IIFLNNSAKS EATTAATAGN KDSCGGAIAA 

1001 NSVTLTNNPE ITFKGNYAET GGAIGCIDLT NGSPPRKVSI ADNGSVLFQD 

1051 NSALNRGGAI YGETIDISRT GATFIGNSSK HDGSAICCST ALTLAPNSQL 

1101 IFENNKVTET TATTKASINN LGAAIYGNNE TSDVTISLSA ENGSIFFKNN 

1151 LCTATNKYCS IAGNVKFTAI EASAGKAISF YDAVNVSTKE TNAQELKLNE 

1201 KATSTGTILF SGELHENKSY IPQKVTFAHG NLILGKNAEL SVVSFTQSPG 

1251 TTITMGPGSV LSNHSKEAGG IAINNVIIDF SEIVPTKDNA TVAPPTLKLV 

1301 SRTNADSKDK IDITGTVTLL DPNGNLYQNS YLGEDRDITL FNIDNSASGA 

1351 VTATNVTLQG NLGAKKGYLG TWNLDPNSSG SKIILKWTFD KYLRWPYIPR 

1401 DNHFYINSIW GAQNSLVTVK QGILGNMLNN ARFEDPAFNN FWASAIGSFL 

1451 RKEVSRNSDS FTYHGRGYTA AVDAKPRQEF ILGAAFSQVF GHAESEYHLD 

1501 NYKHKGSGHS TQASLYAGNI FYFPAIRSRP ILFQGVATYG YMQHDTTTYY 

1551 PSIEEKNMAN WDSIAWLFDL RFSVDLKEPQ PHSTARLTFY TEAEYTRIRQ 

1601 EKFTELDYDP RSFSACSYGN LAIPTGFSVD GALAWREIIL YNKVSAAYLP 

1651 VILRNNPKAT YEVLSTKEKG NVVNVLPTRN AARAEVSSQI YLGSYWTLYG 

1701 TYTIDASMNT LVQMANGGIR FVF* 

A predicted signal peptide is highlighted. 

The cp6830 nucleotide sequence <SEQ ID 120> is: 

1 ATGAAGTGGC TACCAGCTAC AGCTGTTTTT GCTGCCGTAC TCCCCGCACT 

51 AACAGCCTTC GGAGATCCCG CGTCTGTTGA AATAAGTACC AGCCATACAG 

101 GATCCGGGGA TCCTACAAGC GACGCTGCCT TAACAGGATT TACACAAAGT 

151 TCCACAGAAA CTGACGGTAC TACCTATACC ATTGTCGGTG ATATCACCTT 

201 CTCTACTTTT ACGAATATTC CTGTTCCCGT AGTAACTCCA GACGCCAACG 

251 ATAGTTCCAG CAATAGCTCT AAAGGAGGAA GTAGCAGTAG TGGAGCTACA 

301 TCTCTAATCC GATCCTCAAA CCTACACTCC GATTTTGATT TTACAAAAGA 

351 TAGCGTGTTA GACCTCTATC ACCTTTTCTT TCCTTCAGCT TCAAATACTC 

401 TCAATCCTGC ACTCCTTTCT TCCAGTAGCA GCGGTGGATC CTCGAGCAGC 

451 AGTAGCTCCT CATCATCTGG AAGTGCATCT GCTGTTGTTG CTGCGGACCC 

501 AAAAGGAGGC GCTGCCTTTT ATAGTAACGA GGCTAACGGA ACTTTAACCT 

551 TCACTACAGA CTCTGGAAAT CCCGGCTCCC TGACTCTTCA GAATCTTAAA 

601 ATGACCGGAG ATGGAGCCGC CATCTACTCG AAGGGTCCTC TAGTATTTAC 

651 TGGTTTAAAA AATCTAACCT TTACAGGAAA TGAATCTCAG AAATCTGGAG 

701 GTGCTGCCTA TACTGAAGGC GCACTCACAA CACAAGCAAT CGTTGAAGCC 

751 GTAACTTTTA CTGGCAACAC CTCGGCAGGG CAAGGAGGCG CTATCTATGT 

801 TAAAGAAGCT ACCCTATTCA ATGCTCTAGA CAGCCTCAAA TTTGAAAAAA 

851 ACACTTCTGG GCAAGCTGGT GGTGGAATCT ATACAGAGTC TACGCTCACA 

901 ATCTCGAACA TCACAAAATC TATTGAATTT ATCTCTAATA AAGCTTCTGT 

951 CCCTGCCCCC GCTCCTGAGC CCACCTCTCC GGCTCCAAGT AGCTTAATAA 

1001 ATTCTACAAC GATCGATACC TCGACTCTCC AAACCCGAGC AGCATCCGCA 

1051 ACTCCAGCAG TGGCTCCTGT TGCTGCCGTA ACTCCAACAC CAATCTCTAC 

1101 TCAAGAGACC GCAGGAAATG GAGGCGCTAT CTATGCTAAA CAAGGTATTT 

1151 CGATATCCAC GTTTAAAGAT CTGACCTTCA AGTCTAACTC TGCATCGGTA 

1201 GATGCCACCC TTACTGTCGA TTCTAGCACT ATTGGAGAAT CTGGAGGTGC 

1251 TATCTTTGCA GCAGACTCTA TACAAATCCA ACAGTGCACG GGAACCACCT 

1301 TATTCAGTGG CAATACTGCC AATAAGTCTG GTGGGGGTAT TTACGCTGTA 

1351 GGACAAGTCA CCCTAGAAGA TATAGCGAAT CTGAAGATGA CCAACAACAC 

1401 CTGTAAAGGT GAAGGTGGAG CCATCTACAC TAAAAAGGCT TTAACTATCA 

1451 ACAACGGTGC CATTCTCACT ACATTTTCTG GAAATACATC GACAGATAAT 

1501 GGTGGGGCTA TTTTTGCTGT AGGTGGCATC ACTCTCTCTG ATCTTGTAGA 

1551 AGTCCGCTTT AGTAAAAATA AGACCGGAAA TTATTCCGCT CCTATTACCA 

1601 AAGCGGCTAG CAACACAGCT CCTGTAGTTT CTAGCTCTAC AACTGCTGCA 

1651 TCTCCTGCGG TCCCTGCTGC CGCTGCAGCA CCTGTTACAA ACGCAGCAAA 

1701 AGGAGGGGCT TTATATAGTA CAGAAGGACT GACTGTATCT GGAATCACAT 

1751 CGATATTGTC GTTTGAAAAC AACGAATGCC AGAATCAAGG AGGTGGGGCT 

1801 TACGTTACTA AAACCTTCCA GTGTTCCGAT TCTCATCGCC TCCAGTTTAC 

1851 TAGTAATAAA GCAGCAGATG AAGGCGGGGG CCTGTATTGT GGTGACGATG 

1901 TCACGCTAAC GAACCTGACA GGGAAAACAC TATTTCAAGA GAATAGCAGT 

1951 GAGAAACATG GAGGTGGGCT CTCTCTCGCC TCAGGAAAAT CTCTGACTAT 

2001 GACATCGTTA GAGAGCTTCT GCTTAAATGC AAATACAGCA AAGGAAAACG 

2051 GAGGCGGTGC GAATGTCCCT GAAAATATTG TACTCACCTT CACCTATACT 

2101 CCCACTCCAA ATGAACCTGC GCCTGTGCAG CAGCCCGTGT ATGGAGAAGC 

2151 TCTTGTTACT GGAAATACAG CCACAAAAAG TGGTGGGGGC ATTTACACGA 

2201 AAAATGCGGC CTTCTCAAAT TTATCTTCTG TAACTTTTGA TCAAAATACC 

2251 TCTTCAGAAA ATGGTGGTGC CTTACTTACC CAAAAAGCTG CAGATAAAAC 

2301 GGACTGTTCT TTCACCTATA TTACAAATGT CAATATCACC AACAATACAG 

2351 CTACAGGAAA TGGTGGGGGC ATTGCTGGGG GAAAAGCACA TTTCGATCGC 

2401 ATTGATAATC TTACAGTCCA AAGCAACCAA GCAAAGAAAG GTGGTGGGGT 

2451 TTATCTTGAA GATGCCCTCA TCCTGGAAAA GGTTATTACA GGTTCTGTCT 

2501 CACAAAATAC AGCTACAGAA AGTGGTGGGG GTATCTACGC TAAGGATATT 

2551 CAACTACAAG CTCTACCTGG AAGCTTCACA ATTACCGATA ATAAAGTCGA 

2601 AACTAGTCTT ACTACTAGCA CTAATTTATA TGGTGGGGGC ATCTATTCCA 

2651 GTGGAGCTGT CACGCTAACC AATATATCTG GAACCTTTGG CATTACAGGA 

2701 AACTCTGTTA TCAATACAGC GACATCCCAG GATGCAGATA TACAAGGTGG 

2751 GGGCATTTAT GCAACCACGT CTCTCTCAAT AAATCAATGT AATACACCCA 

2801 TTCTATTTAG CAACAACTCT GCTGCCACTA AAAAAACATC AACAACAAAG 

2851 CAAATTGCTG GTGGGGCTAT CTTCTCCGCT GCAGTAACTA TCGAGAATAA 

2901 CTCTCAGCCC ATTATTTTCT TAAATAATTC CGCAAAGTCG GAAGCAACTA 

2951 CAGCAGCAAC TGCAGGAAAT AAAGATAGCT GTGGAGGAGC CATTGCAGCT 

3001 AACTCTGTTA CTTTAACAAA TAACCCTGAA ATAACCTTTA AAGGAAATTA 

3051 TGCAGAAACT GGAGGAGCGA TTGGCTGTAT TGATCTTACT AATGGCTCAC 

3101 CTCCCCGTAA AGTCTCTATT GCAGACAACG GTTCTGTCCT TTTTCAAGAC 

3151 AACTCTGCGT TAAATCGCGG AGGCGCTATC TATGGAGAGA CTATCGATAT 

3201 CTCCAGGACA GGTGCGACTT TCATCGGTAA CTCTTCAAAA CATGATGGAA 

3251 GTGCAATTTG CTGTTCAACA GCCCTAACTC TTGCGCCAAA CTCCCAACTT 

3301 ATCTTTGAAA ACAATAAGGT TACGGAAACC ACAGCCACTA CAAAAGCTTC 

3351 CATAAATAAT TTAGGAGCTG CAATTTATGG AAATAATGAG ACTAGTGACG 

3401 TCACTATCTC TTTATCAGCT GAGAATGGAA GTATTTTCTT TAAAAACAAT 

3451 CTATGCACAG CAACAAACAA ATACTGCAGT ATTGCTGGAA ACGTAAAATT 

3501 TACAGCAATA GAAGCTTCAG CAGGGAAAGC TATATCTTTC TATGATGCAG 

3551 TTAACGTTTC CACCAAAGAA ACAAATGCTC AAGAGCTAAA ATTAAATGAA 

3601 AAAGCGACAA GTACAGGAAC GATTCTATTT TCTGGGGAAC TTCACGAAAA 

3651 TAAATCCTAT ATTCCACAGA AAGTCACTTT CGCACATGGG AATCTCATTC 

3701 TAGGTAAAAA TGCAGAACTT AGCGTAGTTT CCTTTACCCA ATCTCCAGGC 

3751 ACCACAATCA CTATGGGCCC AGGATCGGTT CTTTCCAACC ATAGCAAAGA 

3801 AGCAGGAGGA ATCGCTATAA ACAATGTCAT CATTGATTTT AGTGAAATCG 

3851 TTCCTACTAA AGATAATGCA ACAGTAGCTC CACCCACTCT TAAATTAGTA 

3901 TCGAGAACTA ATGCAGATAG TAAAGATAAG ATTGATATTA CAGGAACTGT 

3951 GACTCTTCTA GATCCTAATG GCAACTTATA TCAAAATTCT TATCTTGGTG 

4001 AAGACCGCGA TATCACTCTT TTCAATATAG ACAATTCTGC AAGTGGGGCA 

4051 GTTACAGCCA CGAATGTCAC CCTTCAAGGG AATTTAGGAG CTAAAAAAGG 

4101 ATATTTAGGA ACCTGGAATT TGGATCCAAA TTCCTCGGGT TCAAAAATTA 

4151 TTCTAAAATG GACCTTTGAC AAATACCTGC GCTGGCCCTA CATCCCTAGA 

4201 GACAACCACT TCTACATCAA CTCTATTTGG GGAGCACAAA ACTCTTTAGT 

4251 GACTGTGAAA CAAGGGATCT TAGGGAACAT GTTGAACAAT GCAAGGTTTG 

4301 AAGATCCTGC TTTCAACAAC TTCTGGGCTT CGGCTATAGG ATCTTTCCTT 

4351 AGGAAAGAAG TATCTCGAAA TTCTGACTCA TTCACCTATC ATGGCAGAGG 

4401 CTATACCGCT GCTGTGGATG CCAAACCTCG CCAAGAATTT ATTTTAGGAG 

4451 CTGCCTTCAG TCAGGTTTTT GGTCACGCCG AGTCTGAATA TCACCTTGAC 

4501 AACTATAAGC ATAAAGGCTC AGGTCACTCT ACACAAGCAT CTCTTTATGC 

4551 TGGCAATATC TTCTATTTTC CTGCGATACG GTCTCGGCCT ATTCTATTCC 

4601 AAGGTGTGGC GACCTATGGT TATATGCAAC ATGACACCAC AACCTACTAT 

4651 CCTTCTATTG AAGAAAAAAA TATGGCAAAC TGGGATAGCA TTGCTTGGTT 

4701 ATTTGATCTG CGTTTCAGTG TGGATCTTAA AGAACCTCAA CCTCACTCTA 

4751 CAGCAAGGCT TACCTTCTAT ACAGAAGCTG AGTATACCAG AATTCGCCAG 

4801 GAGAAATTCA CAGAGCTAGA CTATGATCCT AGATCTTTCT CTGCATGCTC 

4851 TTATGGAAAC TTAGCAATTC CTACTGGATT CTCTGTAGAC GGAGCATTAG 

4901 CTTGGCGTGA GATTATTCTA TATAATAAAG TATCAGCTGC GTACCTCCCT 

4951 GTGATTCTCA GGAATAATCC AAAAGCGACC TATGAAGTTC TCTCTACAAA 

5001 AGAAAAGGGC AACGTAGTCA ACGTTCTCCC TACAAGAAAC GCAGCTCGTG 

5051 CAGAGGTGAG CTCTCAAATT TATCTTGGAA GTTACTGGAC ACTCTACGGC 

5101 ACGTATACTA TTGATGCTTC AATGAATACT TTAGTGCAAA TGGCCAACGG 

5151 AGGGATCCGG TTTGTATTCT AG 

The PSORT algorithm predicts an outer membrane location (0.926). 

The protein was expressed in E.coli and purified as a GST-fusion (Figure 60A) or his-tagged 
product. The recombinant proteins were used to immunise mice, whose sera were used in Western 
blot (Figure 60B) and FACS (Figure 60C) analyses. 

The cp6830 protein was also identified in the 2D-PAGE experiment (Cpn0540) and showed good 
cross-reactivity with human sera, including sera from patients with pneumonitis. 

These experiments show that cp6830 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 

Example 61 

The following C.pneumoniae protein (pid 4376854) was expressed <SEQ ID 121; cp6854>: 

1 MSIAIAREQY AAILDMHPKP SIAMFSSEQA RTSWEKRQAH PYLYRLLEII 

51 WGVVKFLLGL IFFIPLGLFW VLQKICQNFI LLGAGGWIFR PICRDSNLLR 

101 QAYAARLFSA SFQDHVSSVR RVCLQYDEVF IDGLELRLPN AKPDRWMLIS 

151 NGNSDCLEYR TVLQGEKDWI FRIAEESQSN ILIFNYPGVM KSQGNITRNN 

201 VVKSYQACVR YLRDEPAGPQ ARQIVAYGYS LGASVQAEAL SKEIADGSDS 

251 VRWFVVKDRG ARSTGAVAKQ FIGSLGVWLA NLTHWNINSE KRSKDLHCPE 

301 LFIYGKDSQG NLIGDGLFKK ETCFAAPFLD PKNLEECSGK KIPVAQTGLR 

351 HDHILSDDVI KEVAGHIQRH FDN* 

The cp6854 nucleotide sequence <SEQ ID 122> is: 

1 ATGTCAATAG CTATTGCAAG GGAACAATAC GCAGCTATAT TGGATATGCA 

51 TCCTAAACCT TCGATCGCCA TGTTTTCTTC GGAGCAGGCG AGAACTTCTT 

101 GGGAGAAACG ACAGGCTCAT CCTTACCTTT ATCGTCTTCT TGAGATCATA 

151 TGGGGTGTTG TGAAATTTCT TCTCGGCTTA ATCTTCTTTA TTCCCTTGGG 

201 TCTTTTCTGG GTCCTTCAGA AGATATGTCA GAATTTTATT CTTCTTGGTG 

251 CAGGAGGGTG GATTTTTAGA CCCATATGCA GGGACTCTAA TTTATTGCGA 

301 CAAGCTTACG CCGCGCGTCT TTTCTCCGCT TCATTCCAAG ATCATGTCTC 

351 CTCTGTGCGA AGGGTTTGCT TACAGTATGA CGAGGTCTTT ATTGACGGAT 

401 TGGAGTTACG TCTTCCCAAT GCTAAGCCAG ATCGATGGAT GTTAATCTCC 

451 AATGGAAACT CCGATTGCTT AGAGTATAGG ACAGTGCTGC AAGGGGAAAA 

501 GGACTGGATA TTCCGTATTG CTGAAGAGTC TCAATCCAAC ATTTTAATCT 

551 TCAATTACCC AGGAGTCATG AAGAGCCAAG GGAATATAAC AAGAAACAAT 

601 GTAGTCAAAT CTTATCAAGC ATGCGTACGC TATCTTAGAG ATGAACCCGC 

651 AGGACCTCAG GCGCGTCAAA TCGTTGCTTA TGGCTATTCT TTAGGAGCTA 

701 GTGTTCAAGC CGAAGCATTA AGTAAAGAGA TCGCAGACGG AAGTGATAGC 

751 GTCCGTTGGT TTGTCGTTAA AGATCGAGGA GCTCGCTCTA CAGGAGCCGT 

801 TGCTAAACAG TTTATTGGAA GTCTAGGAGT TTGGCTGGCG AATCTTACCC 

851 ATTGGAATAT TAATTCTGAA AAGAGAAGCA AGGACTTGCA TTGCCCAGAA 

901 CTCTTTATTT ATGGCAAGGA TTCCCAAGGT AATCTTATCG GGGATGGATT 

951 GTTCAAAAAA GAGACGTGCT TCGCAGCACC ATTTTTAGAT CCTAAAAACT 

1001 TGGAAGAGTG TTCAGGGAAG AAAATCCCTG TAGCTCAGAC CGGTCTAAGA 

1051 CACGATCATA TCCTTTCCGA TGATGTGATT AAAGAAGTTG CAGGTCATAT 

1101 TCAAAGACAT TTCGATAATT A 

The PSORT algorithm predicts an inner membrane location (0.461). 

The protein was expressed in E.coli and purified as a GST-fusion product, as shown in Figure 61A. 
The recombinant protein was used to immunise mice, whose sera were used in Western blot (Figure 
61B) and FACS (Figure 61C) analyses. A his-tagged protein was also expressed. 

These experiments show that cp6854 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 
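
As an illustrative consistency check (a minimal sketch, not part of the experimental work described above), a listed nucleotide sequence can be translated with the standard genetic code and compared with the start of the corresponding protein sequence. The helper names `clean` and `translate` are hypothetical; the only data used are the first 30 nucleotides of the cp6854 sequence <SEQ ID 122> as printed above.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def clean(seq):
    """Keep only A/C/G/T, dropping position labels, spaces and line breaks.

    Note: any other character (e.g. a mis-scanned base) is silently dropped,
    which shifts the reading frame, so such lines should be inspected by hand.
    """
    return "".join(ch for ch in seq.upper() if ch in "ACGT")


def translate(seq):
    """Translate a nucleotide listing in frame 1; '*' marks a stop codon."""
    dna = clean(seq)
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


# First three blocks of the cp6854 nucleotide listing (hypothetical usage).
print(translate("1 ATGTCAATAG CTATTGCAAG GGAACAATAC"))
```

The printed residues can then be compared by eye with the first block of the cp6854 protein listing <SEQ ID 121>.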
Example 62 

The following C.pneumoniae protein (pid 4377101) was expressed <SEQ ID 123; cp7101>: 



1 MYSCYSKGIS HNYLLHPMSR LDIFVFDSLI ANQDQNLLEE IFCSEDTVLF 

51 KAYRTTALQS PLAAKNLNIA RKVANYILAD NGEIDTVKLV EAIHHLSQCT 

101 YPLGPHRHNE AQDREHLLKM LKALKENPKL KESIKTLFVP SYSTIQNLIR 

151 HTLALNPQTI LSTIHVRQAA LTALFTYLRQ DVGSCFATAP AILIHQEYPE 

201 RFLKDLNDLI SSGKLSRIVN QREIAVPINL SGCIGELFKP LRILDLYPDP 

251 LVKLSSSPGL KKAFSAANLI ETLGDSEAQI QQLLSHQYLM QKLQNVHETL 

301 TANDIIKSTL LHYYQLQEST VRAIFFKEGL FSKEQVAFST QHPRELSEIQ 

351 RVYHYLHAYE EAKSAFIHDT QNPLLKAWEY TLATLADASQ PTISNHIRLA 

401 LGWKSEDPHS LVSLVTHFVE EEVENIRILV QQCEQTYHEA RSQLEYIEGR 

451 MRNPLNNQDS QILTMDHMRF RQELNKALYE WDSAQEKAKK FLHLPEFLLS 

501 FYTKQIPLYF RSSYDAFIQE FAHLYANAPA GFRILFTHGR THPNTWSPIY 

551 SINEFIRFLS EFFTSTESEL LGKHAVINLE KETSRLVHNI TAMLHTDVFQ 

601 EALLTRILEA YQLPVPPSIL NHLDQLSQTP WVYVSGGTVD TLLLDYFESS 

651 EPLTLTEKHP ENPHELAAFY ADALKDLPTG IKSYLEEGSH SLLSSSPTHV 

701 FSIIAGSPLF REAWDNDWYS YTWLRDVWVK QHQDFLQDTI LPQLSIYAFI 

751 ENFCNKYALQ HVVHDFHDFC SDHSLTLPEL YDKGSRFLSS LFTKDKTVAL 

801 IYIRRLLYLM VREVPYVSEQ QLPEVLDNVS SYLGISSRIT YEKFRSLIEE 

851 TIPKMTLLSS ADLRHIYKGL LMQSYQKIYT EEDTYLRLTT AMRHHNLAYP 

901 APLLFADSNW PSIYFGFILN PGTTEIDLWK FNYAGLQGQP LDNIQELFAT 

951 SRPWTLYANP IDYGMPPPPG YRSRLPKEFF 



The cp7101 nucleotide sequence <SEQ ID 124> is: 



1 ATGTATTCGT GTTACAGCAA AGGAATATCC CATAACTATC TTCTACATCC 

51 TATGTCACGT TTGGATATTT TTGTTTTCGA TTCTCTGATC GCAAACCAGG 

101 ATCAAAATCT TCTTGAGGAA ATTTTCTGTT CTGAAGACAC AGTTTTATTT 

151 AAAGCCTACC GTACTACGGC TCTACAATCC CCTCTAGCTG CTAAGAACCT 

201 AAATATCGCC CGTAAAGTCG CAAATTATAT CTTAGCTGAC AATGGGGAAA 

251 TCGATACAGT AAAGCTTGTC GAAGCCATTC ACCATCTCTC ACAATGTACC 

301 TATCCTTTAG GGCCTCATCG CCATAATGAA GCTCAAGATC GTGAACACCT 

351 CCTTAAAATG CTAAAAGCTC TAAAGGAAAA TCCTAAATTA AAAGAAAGCA 

401 TCAAAACTCT CTTTGTCCCT TCATACTCTA CAATCCAAAA CCTAATTCGC 

451 CATACACTAG CATTGAATCC ACAGACAATT CTCTCTACGA TTCATGTGCG 

501 TCAAGCAGCA CTCACAGCGC TCTTCACCTA CCTTCGGCAA GATGTAGGTT 

551 CCTGTTTTGC TACGGCTCCT GCCATTCTCA TTCACCAAGA ATATCCAGAA 

601 CGATTCCTTA AAGATCTCAA TGATCTCATT AGCAGTGGCA AACTCTCTAG 

651 AATCGTAAAC CAAAGGGAAA TTGCGGTTCC TATAAACCTT TCGGGATGCA 

701 TTGGAGAGCT ATTCAAGCCT TTAAGGATTC TAGATCTTTA TCCTGATCCT 

751 CTGGTTAAGC TCTCCTCATC TCCAGGACTC AAAAAAGCCT TTTCTGCTGC 

801 CAATCTTATT GAAACTCTTG GGGATTCTGA AGCACAAATC CAACAGTTGC 

851 TCTCGCATCA ATATTTGATG CAAAAACTAC AAAATGTCCA TGAGACCTTA 

901 ACTGCTAACG ACATTATCAA ATCGACACTT CTGCACTACT ATCAGCTCCA 

951 AGAAAGTACT GTACGAGCTA TTTTCTTCAA AGAAGGGTTG TTCAGCAAAG 

1001 AACAAGTGGC ATTCTCGACG CAACACCCCA GAGAGCTCTC AGAAATACAA 

1051 CGGGTATACC ACTACTTACA TGCCTATGAA GAAGCAAAAT CTGCTTTTAT 

1101 CCATGACACT CAAAATCCCT TACTGAAAGC CTGGGAGTAT ACTTTAGCGA 

1151 CTCTTGCGGA TGCTAGCCAA CCTACCATCT CAAACCATAT CCGCCTTGCC 

1201 TTAGGATGGA AAAGTGAAGA CCCTCACAGT CTTGTATCTC TAGTTACACA 

1251 CTTTGTTGAA GAGGAAGTAG AAAACATCCG AATTTTAGTC CAACAATGTG 

1301 AACAGACCTA TCACGAAGCA CGCTCCCAAC TAGAATATAT TGAAGGGCGG 

1351 ATGCGCAACC CACTAAATAA TCAAGACAGT CAGATTTTGA CGATGGATCA 

1401 CATGCGCTTC CGTCAAGAAC TCAATAAAGC TCTTTATGAG TGGGATAGTG 

1451 CTCAAGAAAA GGCAAAGAAA TTTCTACATC TTCCTGAATT CTTACTTTCT 

1501 TTCTATACAA AGCAAATTCC CTTATACTTT CGTAGTTCTT ACGATGCCTT 

1551 CATTCAAGAA TTTGCTCATC TCTATGCTAA TGCTCCCGCT GGCTTCCGTA 

1601 TTCTTTTCAC GCATGGACGC ACCCATCCGA ACACATGGTC CCCCATCTAT 

1651 TCGATTAATG AATTTATACG TTTTCTTTCT GAATTCTTCA CCTCCACAGA 

1701 GTCAGAACTT CTGGGGAAAC ATGCCGTGAT CAATTTAGAG AAAGAAACAT 

1751 CTCGGCTCGT CCACAACATC ACTGCCATGC TACACACGGA TGTTTTCCAA 

1801 GAAGCTCTCC TTACAAGAAT TTTAGAAGCC TATCAGCTTC CTGTGCCTCC 

1851 CTCCATCTTA AACCACTTAG ATCAGCTGTC ACAAACTCCC TGGGTTTATG 

1901 TTTCTGGAGG AACAGTGGAC ACTCTTCTTT TGGATTATTT TGAAAGCTCA 

1951 GAACCTCTGA CACTTACAGA AAAGCATCCT GAAAATCCTC ATGAGCTTGC 

2001 AGCTTTCTAC GCAGACGCCC TTAAAGATCT CCCTACAGGA ATTAAAAGTT 

2051 ATCTAGAAGA AGGATCCCAC TCTCTACTTA GCTCATCACC CACCCACGTT 

2101 TTCTCTATAA TCGCAGGATC TCCTTTATTT CGGGAAGCTT GGGATAATGA 

2151 TTGGTACAGC TATACCTGGC TTCGTGATGT CTGGGTGAAA CAACACCAAG 

2201 ATTTCCTTCA AGATACTATA TTACCTCAGC TAAGTATCTA TGCTTTCATA 

2251 GAGAATTTTT GTAACAAATA TGCTTTGCAA CATGTAGTTC ATGACTTTCA 

2301 TGATTTCTGC TCCGACCACT CCTTGACTCT TCCGGAGCTC TATGACAAAG 

2351 GATCGCGTTT TCTAAGCTCC TTATTCACCA AAGATAAGAC CGTAGCTCTT 

2401 ATCTATATAC GCCGTCTTCT CTACCTTATG GTCCGTGAAG TCCCTTATGT 

2451 TTCAGAACAA CAGCTTCCAG AAGTCTTAGA TAACGTCTCT TCATATCTCG 

2501 GGATTTCCTC TCGTATTACC TATGAGAAAT TCCGCTCCCT GATAGAGGAA 

2551 ACCATCCCTA AAATGACCTT ACTCTCCTCA GCAGACCTGA GGCATATCTA 

2601 TAAAGGTCTC CTCATGCAAA GTTATCAAAA GATCTACACC GAAGAAGATA 

2651 CGTACCTCCG CCTCACCACG GCAATGAGGC ATCATAATCT TGCCTATCCC 

2701 GCTCCTTTGC TCTTTGCAGA CAGTAACTGG CCTTCTATTT ATTTTGGATT 

2751 CATCCTAAAT CCAGGAACCA CAGAGATCGA TCTTTGGAAA TTTAACTATG 

2801 CAGGGCTGCA AGGACAGCCT CTTGACAATA TCCAGGAGCT GTTCGCAACG 

2851 TCAAGACCCT GGACCCTCTA TGCAAATCCT ATAGATTATG GCATGCCACC 

2901 GCCTCCAGGC TACCGCAGCC GCCTCCCTAA AGAATTTTTC TAG 



The PSORT algorithm predicts a cytoplasmic location (0.206). 



The protein was expressed in E.coli and purified as a GST-fusion (Figure 62A) or his-tagged 
product. The proteins were used to immunise mice, whose sera were used in Western blot (Figure 
62B) and FACS (Figure 62C) analyses. 

This protein also showed good cross-reactivity with human sera, including sera from patients with 
pneumonitis. 

These experiments show that cp7101 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 



Example 63 

The following C.pneumoniae protein (pid 4377107) was expressed <SEQ ID 125; cp7107>: 

1 MSIVRNSALP LPCLSRSETF KKVRSHMKFM KVLTPWIYRK DLWVTAFLLT 

51 AIPGSFAHTL VDIAGEPRHA AQATGVSGDG KIVIGMKVPD DPFAITVGFQ 

101 YIDGHLQPLE AVRPQCSVYP NGITPDGTVI VGTNYAIGMG SVAVKWVNGK 

151 VSELPMLPDT LDSVASAVSA DGRVIGGNRN INLGASVAVK WEDDVITQLP 

201 SLPDAMNACV NGISSDGSII VGTMVDVSWR NTAVQWIGDQ LSVIGTLGGT 

251 TSVASAISTD GTVIVGGSEN ADSQTHAYAY KNGVMSDIGT LGGFYSLAHA 

301 VSSDGSVIVG VSTNSEHRYH AFQYADGQMV DLGTLGGPES YAQGVSGDGK 

351 VIVGRAQVPS GDWHAFLCPF QAPSPAPVHG GSTVVTSQNP RGMVDINATY 

401 SSLKNSQQQL QRLLIQHSAK VESVSSGAPS FTSVKGAISK QSPAVQNDVQ 

451 KGTFLSYRSQ VHGNVQNQQL LTGAFMDWKL ASAPKCGFKV ALHYGSQDAL 

501 VERAALPYTE QGLGSSVLSG FGGQVQGRYD FNLGETVVLQ PFMGIQVLHL 

551 SREGYSEKNV RFPVSYDSVA YSAATSFMGA HVFASLSPKM STAATLGVER 

601 DLNSHIDEFK GSVSAMGNFV LENSTVSVLR PFASLAMYYD VRQQQLVTLS 

651 VVMNQQPLTG TLSLVSQSSY NLSF* 

The cp7107 nucleotide sequence <SEQ ID 126> is: 

1 ATGAGTATAG TCAGAAATTC TGCATTGCCA CTTCCGTGTT TAAGCAGATC 

51 CGAAACCTTT AAAAAAGTTA GGTCGCATAT GAAATTTATG AAAGTCCTTA 

101 CTCCATGGAT TTATCGAAAA GATCTTTGGG TAACAGCATT CTTACTGACA 

151 GCAATTCCAG GATCTTTTGC ACATACTCTT GTTGATATAG CAGGAGAACC 

201 TCGGCATGCT GCTCAAGCAA CAGGAGTTTC TGGAGATGGT AAAATTGTTA 

251 TAGGAATGAA AGTTCCGGAT GATCCTTTTG CTATAACTGT AGGATTTCAA 

301 TATATTGATG GGCATTTGCA ACCCTTAGAG GCAGTACGTC CTCAATGCTC 

351 TGTATACCCT AATGGTATAA CCCCGGACGG AACGGTTATT GTGGGTACAA 

401 ACTATGCCAT CGGGATGGGT AGTGTTGCTG TGAAATGGGT AAATGGCAAG 

451 GTTTCTGAAC TTCCCATGCT CCCTGACACC CTCGATTCTG TAGCATCGGC 

501 AGTTTCTGCA GATGGAAGAG TGATTGGAGG GAATAGAAAT ATAAATCTTG 

551 GCGCTTCTGT TGCTGTGAAA TGGGAGGACG ACGTGATTAC ACAACTTCCT 

601 TCTCTTCCTG ATGCTATGAA TGCTTGTGTT AACGGAATTT CTTCAGATGG 
651 TTCTATAATT GTAGGAACCA TGGTAGACGT GTCATGGAGA AATACCGCAG 

701 TACAATGGAT CGGGGATCAG CTCTCTGTTA TTGGGACTTT AGGAGGAACT 

751 ACTTCTGTTG CTAGTGCAAT CTCAACAGAT GGCACTGTGA TTGTAGGAGG 

801 TTCTGAAAAT GCAGATTCTC AGACTCATGC CTATGCTTAT AAAAACGGTG 

851 TTATGAGCGA TATAGGGACC CTCGGAGGTT TTTATTCTTT AGCACATGCA 

901 GTATCTTCAG ATGGTTCTGT GATTGTAGGA GTATCCACGA ACTCTGAGCA 

951 TAGATATCAT GCATTCCAAT ATGCTGATGG ACAGATGGTA GATTTAGGAA 

1001 CTTTAGGAGG GCCTGAATCT TATGCTCAAG GTGTGTCTGG AGATGGAAAG 

1051 GTAATTGTGG GTAGAGCACA AGTACCATCT GGAGATTGGC ATGCGTTCCT 

1101 ATGTCCTTTC CAAGCTCCGA GCCCTGCTCC TGTCCATGGG GGAAGCACTG 

1151 TCGTAACTAG CCAGAATCCA CGTGGAATGG TAGATATCAA TGCTACGTAC 

1201 TCCTCTTTGA AAAATAGCCA ACAACAACTA CAAAGATTGC TTATCCAGCA 

1251 TAGTGCAAAA GTTGAAAGTG TATCCTCAGG AGCACCATCT TTTACAAGTG 

1301 TGAAAGGTGC GATCTCAAAA CAGAGCCCTG CAGTGCAAAA TGATGTACAG 

1351 AAAGGGACGT TTTTAAGTTA CCGTTCCCAA GTTCATGGAA ACGTGCAGAA 

1401 TCAGCAATTG CTCACAGGAG CTTTTATGGA CTGGAAACTC GCTTCAGCTC 

1451 CTAAATGCGG CTTTAAAGTA GCTCTCCACT ATGGCTCTCA AGATGCTCTC 

1501 GTAGAACGTG CAGCTCTTCC TTACACAGAA CAAGGCTTAG GAAGCAGTGT 

1551 CTTGTCAGGT TTTGGAGGAC AAGTTCAAGG ACGCTATGAC TTTAATTTAG 

1601 GAGAAACTGT TGTTCTGCAA CCCTTTATGG GCATTCAAGT TCTCCACCTA 

1651 AGTAGAGAAG GGTATTCTGA GAAGAATGTT CGATTTCCTG TAAGCTATGA 

1701 TTCTGTAGCC TACTCAGCAG CTACTAGCTT TATGGGTGCG CATGTATTTG 

1751 CCTCCCTAAG CCCTAAAATG AGTACAGCAG CAACTTTAGG TGTGGAGAGA 

1801 GATCTGAATT CACATATAGA TGAATTTAAG GGATCCGTCT CTGCTATGGG 

1851 AAACTTTGTC TTGGAAAATT CTACAGTGAG TGTTTTAAGA CCTTTTGCTT 

1901 CTCTTGCTAT GTACTATGAC GTAAGACAAC AGCAACTCGT GACGTTGTCA 

1951 GTAGTTATGA ATCAACAACC CTTAACAGGC ACACTAAGCT TAGTAAGCCA 

2001 AAGTAGCTAT AATCTTAGCT TCTAA 



The PSORT algorithm predicts an inner membrane location (0.100). 

The protein was expressed in E.coli and purified as a GST-fusion (Figure 63A) or his-tagged 
product. The proteins were used to immunise mice, whose sera were used in Western blot (Figure 
63B) and FACS (Figure 63C) analyses. 

These experiments show that cp7107 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 

Example 64 

The following C.pneumoniae protein (pid 4376467) was expressed <SEQ ID 127; cp6467>: 



1 MLRFFAVFIS TLWLITSGCS PSQSSKGIFV VNMKEMPRSL DPGKTRLIAD 

51 QTLMRHLYEG LVEEHSQNGE IKPALAESYT ISEDGTRYTF KIKNILWSNG 

101 DPLTAQDFVS SWKEILKEDA SSVYLYAFLP IKNARAIFDD TESPENLGVR 

151 ALDKRHLEIQ LETPCAHFLH FLTLPIFFPV HETLRNYSTS FEEMPITCGA 

201 FRPVSLEKGL RLHLEKNPMY HNKSRVKLHK IIVQFISNAN TAAILFKHKK 

251 LDWQGPPWGE PIPPEISASL HQDDQLFSLP GASTTWLLFN IQKKPWNNAK 

301 LRKALSLAID KDMLTKVVYQ GLAEPTDHIL HPRLYPGTYP ERKRQNERIL 

351 EAQQLFEEAL DELQMTREDL EKETLTFSTF SFSYGRICQM LREQWKKVLK 

401 FTIPIVGQEF FTIQKNFLEG NYSLTVNQWT AAFIDPMSYL MIFANPGGIS 

451 PYHLQDSHFQ TLLIKITQEH KKHLRNQLII EALDYLEHCH ILEPLCHPNL 

501 RIALNKNIKN FNLFVRRTSD FRFIEKL* 



A predicted signal peptide is highlighted. 

The cp6467 nucleotide sequence <SEQ ID 128> is: 

1 ATGCTCCGTT TCTTCGCTGT ATTTATATCA ACTCTTTGGC TCATTACCTC 

51 AGGATGTTCC CCATCCCAAT CCTCTAAAGG AATTTTTGTG GTAAATATGA 

101 AGGAAATGCC ACGCTCCTTG GATCCTGGAA AAACTCGTCT CATTGCAGAC 

151 CAAACTCTAA TGCGTCATCT ATATGAAGGA CTCGTCGAAG AACATTCCCA 

201 AAATGGAGAG ATTAAACCAG CCCTTGCAGA AAGCTACACC ATCTCCGAAG 

251 ACGGGACTCG GTACACATTT AAAATCAAAA ACATCCTTTG GAGTAACGGA 

301 GACCCTCTGA CAGCTCAAGA CTTTGTCTCC TCTTGGAAGG AAATCCTAAA 

351 GGAAGATGCG TCCTCCGTAT ATCTCTATGC GTTTTTACCT ATCAAAAATG 

401 CTCGGGCAAT CTTTGATGAT ACTGAGTCTC CAGAAAATCT AGGAGTCCGA 

451 GCTTTAGATA AGCGTCATCT CGAAATTCAG TTAGAAACTC CCTGCGCGCA 

501 TTTCCTACAT TTCTTGACTC TTCCTATTTT TTTCCCTGTT CATGAAACTC 

551 TGCGAAACTA TAGCACCTCT TTTGAAGAGA TGCCCATTAC CTGCGGTGCT 

601 TTCCGCCCTG TGTCTCTAGA AAAAGGCCTG AGACTCCATC TAGAGAAAAA 

651 CCCTATGTAC CATAATAAAA GCCGTGTGAA ACTACATAAA ATTATTGTAC 

701 AGTTTATCTC AAACGCTAAC ACTGCAGCCA TTCTATTCAA ACATAAGAAA 

751 TTAGATTGGC AAGGACCTCC TTGGGGAGAA CCTATCCCTC CAGAAATCTC 

801 AGCTTCTCTA CATCAAGATG ACCAGCTCTT TTCTCTTCCG GGCGCTTCGA 

851 CTACATGGTT ACTCTTTAAT ATACAAAAAA AACCTTGGAA CAATGCTAAA 

901 TTACGCAAGG CATTGAGCCT TGCAATAGAC AAAGATATGT TAACCAAAGT 

951 GGTATACCAA GGTCTTGCAG AACCTACAGA TCATATCCTA CATCCAAGAC 

1001 TTTATCCAGG GACCTATCCC GAACGGAAAA GACAAAACGA AAGAATTCTT 

1051 GAGGCTCAAC AACTCTTTGA AGAAGCTCTA GACGAACTTC AAATGACACG 

1101 CGAAGATCTA GAAAAGGAAA CTTTGACTTT CTCAACCTTT TCTTTTTCTT 

1151 ACGGAAGGAT TTGCCAAATG CTAAGAGAAC AATGGAAGAA AGTCTTAAAA 

1201 TTTACTATCC CTATAGTAGG CCAAGAGTTT TTCACAATAC AAAAAAACTT 

1251 CCTAGAGGGG AACTATTCCC TAACCGTGAA CCAATGGACC GCAGCATTTA 

1301 TTGATCCGAT GTCTTATCTC ATGATCTTTG CCAATCCTGG AGGAATTTCC 

1351 CCCTATCACC TCCAAGATTC ACACTTTCAA ACTCTTCTCA TAAAGATCAC 

1401 TCAAGAACAT AAAAAACACC TACGAAATCA GCTTATTATT GAAGCCCTTG 

1451 ACTATTTAGA ACACTGTCAC ATTCTCGAAC CACTATGTCA TCCAAATCTT 

1501 CGAATTGCTT TGAACAAAAA CATTAAAAAC TTTAATCTTT TTGTTCGACG 

1551 AACTTCAGAC TTTCGTTTTA TAGAAAAACT ATAG 



The PSORT algorithm predicts an outer membrane lipoprotein (0.790). 

The protein was expressed in E.coli and purified as a his-tag product and a GST-fusion protein, as 
shown in Figure 64A. The recombinant his-tag protein was used to immunise mice, whose sera were 
used in a Western blot (Figure 64B). The recombinant GST-fusion protein was also used to 
immunise mice, whose sera were used in a Western blot (Figure 64C) and for FACS analysis (Figure 
64D). 

These experiments show that cp6467 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 



Example 65 



The following C. pneumoniae protein (pid 4376679) was expressed <SEQ ID 129; cp6679>: 

1 MRKMLVLLAS LGLLSPTLSS CTKLGSSGSY HPKLYTSGSK TKGVIAMLPV 

51 FHRPGKSLEP LPWNLQGEFT EEISKRFYAS EKVFLIKHNA SPQTVSQFYA 

101 PIANRLPETI IEQFLPAEFI VATELLEQKT GKEAGVDSVT ASVRVRVPDI 

151 RHHKIALIYQ EIIECSQPLT TLVNDYHRYG WNSKHFDSTP MGLMHSRLFR 

201 EVVARVEGYV CANYS* 

A predicted signal peptide is highlighted. 



The cp6679 nucleotide sequence <SEQ ID 130> is: 

1 ATGCGAAAAA TGTTGGTATT ATTGGCATCT TTAGGACTTC TATCCCCAAC 

51 CCTATCCAGC TGCACTCACT TAGGCTCTTC AGGAAGTTAT CATCCTAAGC 

101 TATACACTTC AGGGAGCAAA ACTAAAGGTG TGATTGCGAT GCTTCCTGTA 

151 TTTCATCGCC CAGGAAAGAG TCTTGAACCT TTACCTTGGA ACCTCCAAGG 

201 AGAATTTACT GAAGAGATCA GCAAAAGGTT TTATGCTTCG GAAAAGGTCT 

251 TCCTGATCAA GCACAATGCT TCACCTCAGA CAGTCTCTCA GTTCTATGCT 

301 CCGATTGCGA ATCGTCTACC CGAAACAATT ATTGAGCAAT TTCTTCCTGC 

351 AGAATTCATT GTTGCTACAG AACTGTTAGA ACAAAAGACA GGGAAAGAAG 

401 CAGGTGTCGA TTCTGTAACA GCGTCTGTAC GTGTTCGCGT TTTTGATATC 

451 CGTCATCATA AAATAGCTCT CATTTATCAA GAGATTATCG AATGCAGCCA 

501 GCCTTTAACT ACCCTAGTCA ATGATTATCA TCGCTATGGC TGGAACTCAA 

551 AACATTTTGA TTCAACGCCC ATGGGCTTAA TGCATAGCCG TCTTTTCCGC 

601 GAAGTTGTTG CCAGAGTTGA GGGCTATGTT TGTGCTAACT ACTCGTAG 

The PSORT algorithm predicts an inner membrane location (0.149). 

The protein was expressed in E.coli and purified as a his-tag product (Figure 65A) and as a GST- 
fusion product (Figure 65B). The recombinant protein was used to immunise mice, whose sera were 
used in a Western blot (Figure 65C) and for FACS analysis. 

These experiments show that cp6679 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 

Example 66 

The following C.pneumoniae protein (pid 4376890) was expressed <SEQ ID 131; cp6890>: 

1 MKQLLFCVCV FAMSCSAYAS PRRQDPSVMK ETFRNNYGII VSGQEWVKRG 

51 SDGTITKVLK NGATLHEVYS GGLLHGEITL TFPHTTALDV VQIYDQGRLV 

101 SRKTFFVNGL PSQEELFNED GTFVLTRWPD NNDSDTITKP YFIETTYQGH 

151 VIEGSYTSFN GKYSSSIHNG EGVRSVFSSN NILLSEETFN EGVMVKYTTF 

201 YPNRDPESIT HYQNGQPHGL RLTYLQGGIP NTIEEWRYGF QDGTTIVFKN 

251 GCKTSEIAYV KGVKEGLELR YNEQEIVAEE VSWRNDFLHG ERKIYAGGIQ 

301 KHEWYYRGRS VSKAKFERLN AAG* 

A predicted signal peptide is highlighted. 

The cp6890 nucleotide sequence <SEQ ID 132> is: 

1 ATGAAACAAT TACTTTTCTG TGTTTGCGTA TTTGCTATGT CATGTTCTGC 

51 TTACGCATCC CCACGACGAC AAGATCCTTC TGTTATGAAG GAAACATTCC 

101 GAAATAATTA TGGCATTATT GTTTCCGGTC AAGAATGGGT AAAGCGTGGT 

151 TCTGACGGCA CCATCACCAA AGTACTCAAA AATGGAGCTA CCCTGCATGA 

201 AGTTTATTCT GGAGGCCTCC TTCATGGGGA AATTACCTTA ACGTTTCCCC 

251 ATACCACAGC ATTGGACGTT GTTCAAATCT ATGATCAAGG TAGACTCGTT 

301 TCTCGCAAAA CCTTTTTTGT GAACGGTCTT CCATCTCAAG AAGAGCTGTT 

351 CAATGAAGAT GGCACGTTTG TCCTCACACG ATGGCCGGAC AACAACGACA 

401 GTGATACCAT CACAAAGCCT TACTTCATAG AAACGACATA TCAAGGGCAT 

451 GTCATAGAAG GAAGTTATAC TTCCTTTAAT GGGAAATACT CCTCATCCAT 

501 CCACAATGGA GAGGGAGTTC GTTCTGTGTT CTCCTCCAAT AACATCCTTC 

551 TTTCTGAAGA GACCTTCAAT GAAGGTGTCA TGGTGAAATA TACCACATTC 

601 TATCCGAATC GCGATCCCGA ATCGATTACT CATTATCAAA ATGGACAGCC 

651 TCACGGCTTA CGGCTAACAT ATCTACAAGG TGGCATCCCC AATACGATAG 

701 AGGAGTGGCG TTATGGCTTT CAAGACGGAA CGACCATCGT ATTTAAAAAT 

751 GGTTGTAAGA CATCTGAGAT CGCTTATGTT AAGGGAGTGA AAGAAGGTTT 

801 AGAACTGCGC TACAATGAAC AGGAAATTGT AGCTGAAGAA GTTTCTTGGC 

851 GTAATGATTT TCTGCATGGA GAACGTAAGA TCTATGCTGG AGGAATCCAA 

901 AAGCATGAAT GGTATTACCG CGGGAGATCT GTATCTAAAG CCAAATTCGA 

951 GCGGCTAAAT GCTGCAGGAT AG 

The PSORT algorithm predicts an outer membrane location (0.940). 

The protein was expressed in E.coli and purified as a GST-fusion product, as shown in Figure 66A. 
The recombinant protein was used to immunise mice, whose sera were used in a Western blot 
(Figure 66B) and for FACS analysis. A his-tagged protein was also expressed. 

These experiments show that cp6890 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 

Example 67 



The following C.pneumoniae protein (pid 6172323) was expressed <SEQ ID 133; cp0018>: 
1 MKTSVSMLLA LLCSGASSIV LHAATTPLNP EDGFIGEGNT NTFSPKSTTD 

51 AAGTTYSLTG EVLYIDPGKG GSITGTCFVE TAGDLTFLGN GNTLKFLSVD 

101 AGANIAVAHV QGSKNLSFTD FLSLVITESP KSAVTTGKGS LVSLGAVQLQ 

151 DINTLVLTSN ASVEDGGVIK GNSCLIQGIK NSAIFGQNTS SKKGGAISTT 

201 QGLTIENNLG TLKFNENKAV TSGGALDLGA ASTFTANHEL IFSQNKTSGN 

251 AANGGAINCS GDLTFTDNTS LLLQENSTMQ DGGALCSTGT ISITGSDSIN 

301 VIGNTSGQKG GAISAASLKI LGGQGGALFS NNVVTHATPL GGAIFINTGG 

351 SLQLFTQGGD IVFEGNQVTT TAPNATTKRN VIHLESTAKW TGLAASQGNA 

401 IYFYDPITTN DTGASDNLRI NEVSANQKLS GSIVFSGERL STAEAIAENL 

451 TSRINQPVTL VEGSLVLKQG VTLITQGFSQ EPESTLLLDL GTSL* 

A predicted signal peptide is highlighted. 

The cp0018 nucleotide sequence <SEQ ID 134> is: 

1 ATGAAGACTT CAGTTTCTAT GTTGTTGGCC CTGCTTTGCT CGGGGGCTAG 

51 CTCTATTGTA CTCCATGCCG CAACCACTCC ACTAAATCCT GAAGATGGGT 

101 TTATTGGGGA GGGCAATACA AATACTTTTT CTCCGAAATC TACAACGGAT 

151 GCTGCAGGAA CTACCTACTC TCTCACAGGA GAGGTTCTGT ATATAGATCC 

201 GGGGAAAGGT GGTTCAATTA CAGGAACTTG CTTTGTAGAA ACTGCTGGCG 

251 ATCTTACATT TTTAGGTAAT GGAAATACCC TAAAGTTCCT GTCGGTAGAT 

301 GCAGGTGCTA ATATCGCGGT TGCTCATGTA CAAGGAAGTA AGAATTTAAG 

351 CTTCACAGAT TTCCTTTCTC TGGTGATCAC AGAATCTCCA AAATCCGCTG 

401 TTACTACAGG AAAAGGTAGC CTAGTCAGTT TAGGTGCAGT CCAACTGCAA 

451 GATATAAACA CTCTAGTTCT TACAAGCAAT GCCTCTGTCG AAGATGGTGG 

501 CGTGATTAAA GGAAACTCCT GCTTGATTCA GGGAATCAAA AATAGTGCGA 

551 TTTTTGGACA AAATACATCT TCGAAAAAAG GAGGGGCGAT CTCCACGACT 

601 CAAGGACTTA CCATAGAGAA TAACTTAGGG ACGCTAAAGT TCAATGAAAA 

651 CAAAGCAGTG ACCTCAGGAG GCGCCTTAGA TTTAGGAGCC GCGTCTACAT 

701 TCACTGCGAA CCATGAGTTG ATATTTTCAC AAAATAAGAC TTCTGGGAAT 

751 GCTGCAAATG GCGGAGCCAT AAATTGCTCA GGGGACCTTA CATTTACTGA 

801 TAACACTTCT TTGTTACTTC AAGAAAATAG CACAATGCAG GATGGTGGAG 

851 CTTTGTGTAG CACAGGAACC ATAAGCATTA CCGGTAGTGA TTCTATCAAT 

901 GTGATAGGAA ATACTTCAGG ACAAAAAGGA GGAGCGATTT CTGCAGCTTC 

951 TCTCAAGATT TTGGGAGGGC AGGGAGGCGC TCTCTTTTCT AATAACGTAG 

1001 TGACTCATGC CACCCCTCTA GGAGGTGCCA TTTTTATCAA CACAGGAGGA 

1051 TCCTTGCAGC TCTTCACTCA AGGAGGGGAT ATCGTATTCG AGGGGAATCA 

1101 GGTCACTACA ACAGCTCCAA ATGCTACCAC TAAGAGAAAT GTAATTCACC 

1151 TCGAGAGCAC CGCGAAGTGG ACGGGACTTG CTGCAAGTCA AGGTAACGCT 

1201 ATCTATTTCT ATGATCCCAT TACCACCAAC GATACGGGAG CAAGCGATAA 

1251 CTTACGTATC AATGAGGTCA GTGCAAATCA AAAGCTCTCG GGATCTATAG 

1301 TATTTTCTGG AGAGAGATTG TCGACAGCAG AAGCTATAGC TGAAAATCTT 

1351 ACTTCGAGGA TCAACCAGCC TGTCACTTTA GTAGAGGGGA GCTTAGTACT 

1401 TAAACAGGGA GTGACCTTGA TCACACAAGG ATTCTCGCAG GAGCCAGAAT 

1451 CCACGCTTCT TTTGGATCTG GGGACCTCAT TATAA 



The PSORT algorithm predicts an outer membrane location (0.935). 

The protein was expressed in E.coli and purified as a GST-fusion product (Figure 67A). The 
recombinant protein was used to immunise mice, whose sera were used in a Western blot (Figure 
67B) and for FACS analysis. 

These experiments show that cp0018 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 



Example 68 

The following C.pneumoniae protein (pid 4376262) was expressed <SEQ ID 135; cp6262>: 

1 MRKLRILAIV LIALSIILIA GGVVLLTVAI PGLSSVISSP AGMGACALGC 

51 VMLALGIDVL LKKREVPIVL ASVTTTPGTG SPRSGISISG ADSTIRSLPT 

101 YLLDEGHPQS MRKLRILAIV LIVFSIILIA SGVVLLTVAI PGLSSVISSP 

151 AGMGACALGC VMLALGIDVL LKKREVPIVL ASVTTTPGTG SPRSGISISG 

201 ADSTIRSLPT YPLDEGHPQS MRKLRILAIV LIVFSIILIA SGVVLLTVAI 

251 PGLSSIISSP AEMGACALGC VMLALGIDVL LKKREVPIVV PAPIPEEVVI 

301 DDIDEESIRL QQEAEAALAR LPEEMSAFEG YIKVVESHLE NMKSLPYDGH 

351 GLEEKTKHQI RVVRSSLKAM VPEFLDIRRI FEEEEFFFLS ARKRLIDLAT 

401 TLVERKILTE QLERNNLRKA FSYLYQDSIF KKIIDNFEKL AWKFMILSKS 

451 ICRFTIIFEN HEHGVAKSLL HKNAVLLEKV IYRSLQKSYR DIGMSSAKMK 

501 ILHGNPFFSL EDNKKTIMKE HAEMLESLSS YRKVFLALSD ENVVDTPSDP 

551 KKWDLSGIPC RDALSEISRD EQWQKKAHLK HQESLYTQAR DRLTDQSSKE 

601 NQKELEKAEQ EYISSWERVK KFEIERVQER IRAIQKLYPN ILEREEETTG 

651 QETVTPTVQG TTASSDLTDI LGRIEVSSRE DNQNQESCVK VLRSHEVBMS 

701 WEVKQEYGPK KKEFQDQMGS LERFFTEHIE ELEVLQKDYS KHLSYFKKVN 

751 NKKEVQYAKF RLKVLESDLE GILAQTESAE SLLTQEELPI LATRGALEKA 

801 VFKGSLCCAL ASKAKPYFEE DPRFQDSDTQ LRALTLRLQE AKASLEEEIK 

851 RFSNLENDIA EERRLLKESK QTFERAGLGV LREIAVESTY DLRSLTNTWE 

901 GTPESEKVYF SMYLNYYNEE KRRAKTRLVE MTQRYRDFKM ALEAMQFNEE 

951 ALLQEELSIQ APSE* 



A predicted signal peptide is highlighted. 

The cp6262 nucleotide sequence <SEQ ID 136> is: 



1 ATGAGGAAAC TTCGTATTCT TGCGATCGTT CTCATAGCTT TGAGCATTAT 

51 TTTGATTGCA GGTGGTGTGG TATTGCTTAC TGTAGCGATC CCTGGATTAA 

101 GTTCAGTCAT TTCTTCCCCG GCAGGGATGG GTGCCTGTGC TTTGGGATGT 

151 GTGATGCTTG CTTTAGGGAT CGATGTTCTT CTGAAGAAAC GAGAAGTCCC 

201 TATAGTTCTC GCATCTGTAA CTACGACACC AGGAACTGGC AGCCCTAGAA 

251 GTGGTATTTC TATTTCAGGA GCTGATAGCA CCATACGTTC TCTTCCTACG 

301 TATCTCTTGG ACGAGGGACA TCCACAATCC ATGAGGAAAC TTCGTATTCT 

351 TGCGATCGTT CTCATAGTTT TTAGCATTAT TTTGATTGCA AGTGGTGTGG 

401 TATTGCTTAC TGTAGCGATC CCTGGATTAA GTTCAGTCAT TTCTTCCCCG 

451 GCAGGGATGG GTGCCTGTGC TTTGGGATGT GTGATGCTTG CTTTAGGGAT 

501 CGATGTTCTT CTGAAGAAAC GAGAAGTCCC TATAGTTCTC GCATCTGTAA 

551 CTACGACACC AGGAACTGGC AGCCCTAGAA GTGGTATTTC TATTTCAGGA 

601 GCTGATAGCA CCATACGTTC TCTTCCTACG TATCCCTTGG ACGAGGGACA 

651 TCCACAATCC ATGAGGAAAC TTCGTATTCT TGCGATCGTT CTCATAGTTT 

701 TTAGCATTAT TTTGATTGCA AGTGGTGTGG TATTGCTTAC TGTAGCGATC 

751 CCTGGATTAA GCTCGATCAT TTCTTCCCCA GCGGAGATGG GTGCTTGTGC 

801 TTTGGGATGT GTGATGCTTG CTTTGGGGAT CGACGTTCTT CTGAAGAAAC 

851 GAGAAGTCCC TATAGTAGTT CCCGCACCTA TTCCTGAAGA AGTCGTCATA 

901 GATGATATAG ATGAAGAGAG TATACGGCTG CAGCAGGAAG CTGAAGCCGC 

951 TTTAGCAAGA CTTCCTGAGG AGATGAGTGC ATTTGAAGGT TACATAAAAG 

1001 TTGTCGAGAG TCATTTGGAG AACATGAAAA GCCTGCCTTA TGATGGTCAT 

1051 GGGCTAGAAG AGAAAACGAA ACATCAGATA AGAGTCGTCA GATCTTCTTT 

1101 GAAGGCTATG GTTCCAGAAT TTTTAGATAT CAGAAGAATT TTTGAAGAAG 

1151 AAGAGTTCTT TTTTCTCTCA GCTCGCAAAC GACTTATAGA TTTAGCTACT 

1201 ACTTTAGTAG AGAGAAAAAT TTTAACAGAG CAACTTGAGC GCAATAATTT 

1251 AAGGAAAGCG TTTTCTTATT TATATCAGGA CTCAATTTTT AAAAAAATTA 

1301 TTGATAACTT CGAGAAGTTA GCATGGAAAT TTATGATTTT GAGTAAATCA 

1351 ATTTGTCGAT TTACAATTAT TTTTGAAAAT CATGAACATG GTGTAGCAAA 

1401 GAGCCTGTTA CACAAGAATG CAGTGTTACT GGAGAAGGTA ATCTATAGGA 

1451 GTTTGCAAAA AAGCTATAGA GATATAGGCA TGTCATCTGC AAAGATGAAA 

1501 ATCTTGCACG GCAACCCTTT TTTCTCTTTG GAAGATAATA AAAAGACGAT 

1551 AATGAAAGAA CACGCAGAGA TGCTTGAAAG TCTCAGTAGC TATAGGAAGG 

1601 TATTTTTAGC TCTATCTGAT GAGAACGTTG TAGATACACC TAGCGATCCA 

1651 AAGAAATGGG ATTTGTCAGG AATCCCCTGT AGGGACGCGT TGTCTGAGAT 

1701 TTCTCGTGAT GAACAGTGGC AGAAGAAAGC ACATCTAAAG CATCAAGAGT 

1751 CCCTCTATAC GCAAGCTAGG GATCGTTTAA CAGACCAGAG CTCTAAAGAA 

1801 AATCAGAAAG AGTTAGAGAA AGCTGAACAA GAGTACATAT CTTCTTGGGA 

1851 ACGGGTTAAA AAATTTGAGA TTGAGAGAGT ACAGGAGAGG ATACGGGCAA 

1901 TTCAAAAGCT TTATCCTAAT ATCCTCGAGA GAGAAGAAGA AACCACAGGT 

1951 CAGGAGACTG TGACTCCAAC TGTTCAAGGG ACGACGGCTT CATCCGATTT 

2001 AACAGATATT TTAGGAAGAA TAGAGGTCTC CAGTAGGGAG GATAATCAGA 

2051 ATCAAGAGTC TTGTGTAAAA GTCTTAAGAA GTCATGAGGT AGAAATGAGC 

2101 TGGGAAGTCA AACAAGAGTA TGGCCCTAAG AAAAAAGAAT TTCAGGATCA 

2151 AATGGGTTCT TTAGAGAGGT TTTTTACAGA GCATATTGAA GAGTTAGAAG 

2201 TATTACAGAA GGACTACTCT AAACACTTGT CTTATTTTAA AAAAGTAAAC 

2251 AATAAGAAAG AGGTTCAATA TGCGAAGTTT AGGTTGAAGG TTTTAGAGTC 

2301 AGATTTAGAA GGGATTCTAG CTCAGACTGA GAGTGCTGAG AGTCTGTTAA 

2351 CTCAAGAAGA ACTTCCGATT CTTGCAACTC GGGGAGCCTT AGAGAAAGCT 

2401 GTTTTCAAAG GGAGTCTATG TTGCGCGCTA GCAAGCAAAG CAAAACCCTA 

2451 TTTTGAAGAG GATCCCAGAT TCCAAGATTC TGATACGCAA TTGCGAGCTC 

2501 TGACTCTAAG GTTACAGGAG GCTAAGGCAA GCCTGGAAGA AGAGATAAAG 

2551 AGATTTTCAA ATCTTGAGAA CGATATTGCA GAGGAAAGAC GCCTTCTTAA 

2601 AGAGAGCAAG CAGACGTTCG AAAGAGCAGG TTTAGGGGTT CTCCGAGAAA 

2651 TTGCAGTCGA GTCTACTTAT GATTTGCGTT CCTTAACAAA TACATGGGAA 

2701 GGGACCCCAG AGAGTGAGAA GGTCTATTTT AGCATGTATC TTAATTATTA 

2751 CAACGAAGAG AAACGTAGGG CTAAAACAAG ATTGGTTGAA ATGACACAGA 

2801 GGTATAGAGA TTTTAAAATG GCCTTGGAAG CTATGCAGTT TAATGAAGAA 

2851 GCCCTTTTGC AAGAGGAACT CTCTATTCAA GCTCCCAGTG AATAA 

The PSORT algorithm predicts inner membrane (0.660). 

The protein was expressed in E.coli and purified as a GST-fusion product (Figure 68A). The 
recombinant protein was used to immunise mice, whose sera were used in a Western blot (Figure 
68B) and for FACS analysis. 

These experiments show that cp6262 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 



Example 69 

The following C.pneumoniae protein (pid 4376269) was expressed <SEQ ID 137; cp6269>: 



1 MYQENLRLLE RLLYNSVQKS YADRLFSYEK TKMVHDTPLI PWEEDKEKCA 

51 EAEKAFLEQQ KILLDYGKSI FWLNENDEIN LNDPWSWGLN TVRTRKVFQE 

101 VDDSERWNHK VLIQKLEDDY EKLLEESSKE STEANKKLLS DLVDRLEDAK 

151 TKFFLKKQEE VETRVKDLRA RYGGTVDPKQ DTEAKKKVEL EASLETFLDS 

201 IESELVQCLE DQDIYWKEQD VKDLARTQEL EEQDIEAKRE EAAEDLRSLN 

251 ERLKKSKTMD DRAKWHIENA EDSITWWTSQ IEMKDMKARL KILKEDITSV 

301 LPEIDEIETC LSLEELPLLT TRELLTKSYL KFKICSETLL KMTSVFENNI 

351 YVQEYEVQLQ NLGFKLQGIS QRFGKKQDDF ANLEEQVALQ KKRLRELTQN 

401 FEIQGFNFMK EDFKAAAKDL YIRSTAEQKM NFDVPCMELF RRYHEEVNKP 

451 LLELMYNCAD SYRDAKKKLC SLRLDEKELL QKEIKKEEFY QKKQQRHADR 

501 SRHTTYQKLR IAEELALELK KKI* 

The cp6269 nucleotide sequence <SEQ ID 138> is: 

1 ATGTACCAGG AGAATCTAAG ATTGTTGGAA AGGCTTCTTT ATAATAGTGT 

51 TCAAAAGAGC TATGCGGATC GGCTGTTTTC CTATGAAAAG ACAAAGATGG 

101 TGCACGATAC TCCGCTGATT CCTTGGGAAG AGGATAAGGA AAAATGTGCT 

151 GAAGCTGAGA AAGCTTTCTT AGAGCAACAG AAGATTCTCC TAGATTATGG 

201 AAAATCTATC TTTTGGCTGA ATGAGAACGA TGAGATCAAT TTAAACGATC 

251 CTTGGAGTTG GGGTCTTAAT ACGGTGAGGA CTAGGAAAGT ATTCCAAGAG 

301 GTTGACGACA GTGAACGTTG GAATCATAAG GTACTCATTC AAAAACTCGA 

351 GGACGATTAT GAGAAACTTC TAGAGGAAAG TTCAAAAGAG TCTACTGAAG 

401 CAAATAAGAA GCTTTTATCT GACTTAGTAG ATCGTCTTGA AGATGCTAAG 

451 ACAAAATTTT TCCTGAAGAA ACAGGAGGAG GTGGAGACTC GCGTTAAGGA 

501 TCTTAGAGCT CGATATGGAG GCACAGTAGA TCCTAAGCAG GATACGGAAG 

551 CTAAGAAGAA AGTCGAATTG GAGGCTAGCT TAGAAACCTT TTTAGATTCC 

601 ATCGAATCAG AGCTAGTACA GTGTTTAGAA GATCAAGATA TATATTGGAA 

651 AGAACAGGAT GTCAAAGATC TAGCACGTAC GCAAGAGCTC GAGGAACAAG 

701 ATATTGAAGC GAAGAGGGAA GAAGCTGCCG AAGACCTAAG AAGTCTTAAT 

751 GAGCGTTTAA AGAAGTCAAA AACTATGTTA GATAGGGCTA AATGGCATAT 

801 TGAAAATGCT GAGGACAGTA TTACCTGGTG GACTAGTCAG ATAGAAATGA 

851 AGGATATGAA AGCAAGACTG AAGATCTTAA AAGAAGATAT AACAAGTGTT 

901 CTACCTGAAA TAGATGAGAT TGAAACGTGT TTAAGCTTAG AGGAGCTTCC 

951 TTTGCTTACG ACCAGGGAAC TCTTAACTAA GTCCTACCTA AAGTTTAAGA 

1001 TTTGTTCGGA AACACTATTA AAAATGACTT CTGTGTTTGA GAACAATATC 

1051 TATGTTCAGG AGTACGAGGT TCAGCTGCAA AATCTAGGGT TTAAGTTACA 

1101 AGGTATATCT CAGAGATTCG GAAAGAAACA AGAGGATTTT GCGAATCTAG 

1151 AGGAACAGGT TGCTTTGCAA AAGAAACGAC TCAGAGAGCT CACTCAGAAT 

1201 TTTGAAATAC AAGGATTCAA TTTCATGAAA GAAGATTTTA AGGCAGCCGC 

1251 TAAAGATCTT TATATAAGAA GTACAGCTGA ACAAAAGATG AACTTTGATG 

1301 TGCCTTGCAT GGAGCTCTTC CGTAGGTATC ATGAGGAGGT CAACAAGCCG 

1351 CTTCTTGAGT TGATGTACAA TTGTGCAGAC AGTTATAGAG ATGCTAAGAA 

1401 AAAGCTTTGC TCTCTACGTC TTGATGAAAA AGAGTTATTA CAAAAAGAAA 

1451 TCAAGAAAGA GGAATTTTAT CAAAAGAAAC AACAAAGGCA TGCAGATAGA 

1501 TCACGTCATA CTACGTATCA AAAGCTACGA ATTGCTGAAG AGCTTGCTCT 

1551 TGAGCTGAAG AAGAAAATCT AA 

The PSORT algorithm predicts cytoplasmic location (0.412). 

The protein was expressed in E.coli and purified as a GST-fusion product (Figure 69A). The 
recombinant protein was used to immunise mice, whose sera were used in a Western blot (Figure 
69B) and for FACS analysis. 

These experiments show that cp6269 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 

Example 70 

The following C.pneumoniae protein (pid 4376270) was expressed <SEQ ID 139; cp6270>: 

1 MKIPLRFLLI SLVPTLSMSN LLGAATTEEL SASNSFDGTT STTSFSSKTS 

51 SATDGTNYVF KDSVVIENVP KTGETQSTSC FKNDAAAGDL NFLGGGFSFT 

101 FSNIDATTAS GAAIGSEAAN KTVTLSGFSA LSFLKSPAST VTNGLGAINV 

151 KGNLSLLDND KVLIQDNFST GDGGAINCAG SLKIANNKSL SFIGNSSSTR 

201 GGAIHTKNLT LSSGGETLFQ GNTAPTAAGK GGAIAIADSG TLSISGDSGD 

251 IIFEGNTIGA TGTVSHSAID LGTSAKITAL RAAQGHTIYF YDPITVTGST 

301 SVADALNINS PDTGDNKEYT GTIVFSGEKL TEAEAKDEKN RTSKLLQNVA 

351 FKNGTVVLKG DVVLSANGFS QDANSKLIMD LGTSLVANTE SIELTNLEIN 

401 IDSLRNGKKI KLSAATAQKD IRIDRPVVLA ISDESFYQNG FLNEDHSYDG 

451 ILELDAGKDI VISADSRSID AVQSPYGYQG KWTINWSTDD KKATVSWAKQ 

501 SFNPTAEQEA PLVPNLLWGS FIDVRSFQNF IELGTEGAPY EKRFWVAGIS 

551 NVLHRSGREN QRKFRHVSGG AVVGASTRMP GGDTLSLGFA QLFARDKDYF 

601 MNTNFAKTYA GSLRLQHDAS LYSVVSILLG EGGLREILLP YVSKTLPCSF 

651 YGQLSYGHTD HRMKTESLPP PPPTLSTDHT SWGGYVWAGE LGTRVAVENT 

701 SGRGFFQEYT PFVKVQAVYA RQDSFVELGA ISRDFSDSHL YNLAIPLGIK 

751 LEKRFAEQYY HVVAMYSPDV CRSNPKCTTT LLSNQGSWKT KGSNLARQAG 

801 IVQASGFRSL GAAAELFGNF GFEWRGSSRS YNVDAGSKIK F* 

A predicted signal peptide is highlighted. 



The cp6270 nucleotide sequence <SEQ ID 140> is: 

1 ATGAAGATTC CACTCCGCTT TTTATTGATA TCATTAGTAC CTACGCTTTC 

51 TATGTCGAAT TTATTAGGAG CTGCTACTAC CGAAGAGTTA TCGGCTAGCA 

101 ATAGCTTCGA TGGAACTACA TCAACAACAA GCTTTTCTAG TAAAACATCA 

151 TCGGCTACAG ATGGCACCAA TTATGTTTTT AAAGATTCTG TAGTTATAGA 

201 AAATGTACCC AAAACAGGGG AAACTCAGTC TACTAGTTGT TTTAAAAATG 

251 ACGCTGCAGC TGGAGATCTA AATTTCTTAG GAGGGGGATT TTCTTTCACA 

301 TTTAGCAATA TCGATGCAAC CACGGCTTCT GGAGCTGCTA TTGGAAGTGA 

351 AGCAGCTAAT AAGACAGTCA CGTTATCAGG ATTTTCGGCA CTTTCTTTTC 

401 TTAAATCCCC AGCAAGTACA GTGACTAATG GATTGGGAGC TATCAATGTT 

451 AAAGGGAATT TAAGCCTATT GGATAATGAT AAGGTATTGA TTCAGGACAA 

501 TTTCTCAACA GGAGATGGCG GAGCAATTAA TTGTGCAGGC TCCTTGAAGA 

551 TCGCAAACAA TAAGTCCCTT TCTTTTATTG GAAATAGTTC TTCAACACGT 

601 GGCGGAGCGA TTCATACCAA AAACCTCACA CTATCTTCTG GTGGGGAAAC 

651 TCTATTTCAG GGGAATACAG CGCCTACGGC TGCTGGTAAA GGAGGTGCTA 

701 TCGCGATTGC AGACTCTGGC ACCCTATCCA TTTCTGGAGA CAGTGGCGAC 

751 ATTATCTTTG AAGGCAATAC GATAGGAGCT ACAGGAACCG TCTCTCATAG 

801 TGCTATTGAT TTAGGAACTA GCGCTAAGAT AACTGCGTTA CGTGCTGCGC 

851 AAGGACATAC GATATACTTT TATGATCCGA TTACTGTAAC AGGATCGACA 

901 TCTGTTGCTG ATGCTCTCAA TATTAATAGC CCTGATACTG GAGATAACAA 

951 AGAGTATACG GGAACCATAG TCTTTTCTGG AGAGAAGCTC ACGGAGGCAG 

1001 AAGCTAAAGA TGAGAAGAAC CGCACTTCTA AATTACTTCA AAATGTTGCT 

1051 TTTAAAAATG GGACTGTAGT TTTAAAAGGT GATGTCGTTT TAAGTGCGAA 

1101 CGGTTTCTCT CAGGATGCAA ACTCTAAGTT GATTATGGAT TTAGGGACGT 

1151 CGTTGGTTGC AAACACCGAA AGTATCGAGT TAACGAATTT GGAAATTAAT 

1201 ATAGACTCTC TCAGGAACGG GAAAAAGATA AAACTCAGTG CTGCCACAGC 

1251 TCAGAAAGAT ATTCGTATAG ATCGTCCTGT TGTACTGGCA ATTAGCGATG 

1301 AGAGTTTTTA TCAAAATGGC TTTTTGAATG AGGACCATTC CTATGATGGG 

1351 ATTCTTGAGT TAGATGCTGG GAAAGACATC GTGATTTCTG CAGATTCTCG 

1401 CAGTATAGAT GCTGTACAAT CTCCGTATGG CTATCAGGGA AAGTGGACGA 

1451 TCAATTGGTC TACTGATGAT AAGAAAGCTA CGGTTTCTTG GGCGAAGCAG 

1501 AGTTTTAATC CCACTGCTGA GCAGGAGGCT CCGTTAGTTC CTAATCTTCT 

1551 TTGGGGTTCT TTTATAGATG TTCGTTCCTT CCAGAATTTT ATAGAGCTAG 

1601 GTACTGAAGG TGCTCCTTAC GAAAAGAGAT TTTGGGTTGC AGGCATTTCC 

1651 AATGTTTTGC ATAGGAGCGG TCGTGAAAAT CAAAGGAAAT TCCGTCATGT 

1701 GAGTGGAGGT GCTGTAGTAG GTGCTAGCAC GAGGATGCCG GGTGGTGATA 

1751 CCTTGTCTCT GGGTTTTGCT CAGCTCTTTG CGCGTGACAA AGACTACTTT 

1801 ATGAATACCA ATTTCGCAAA GACCTACGCA GGATCTTTAC GTTTGCAGCA 

1851 CGATGCTTCC CTATACTCTG TGGTGAGTAT CCTTTTAGGA GAGGGAGGAC 

1901 TCCGCGAGAT CCTGTTGCCT TATGTTTCCA AGACTCTGCC GTGCTCTTTC 

1951 TATGGGCAGC TTAGCTACGG CCATACGGAT CATCGCATGA AGACCGAGTC 

2001 TCTACCCCCC CCCCCCCCGA CGCTCTCGAC GGATCATACT TCTTGGGGAG 

2051 GATATGTCTG GGCTGGAGAG CTGGGAACTC GAGTTGCTGT TGAAAATACC 

2101 AGCGGCAGAG GATTTTTCCA AGAGTACACT CCATTTGTAA AAGTCCAAGC 

2151 TGTTTACGCT CGCCAAGATA GCTTTGTAGA ACTAGGAGCT ATCAGTCGTG 

2201 ATTTTAGTGA TTCGCATCTT TATAACCTTG CGATTCCTCT TGGAATCAAG 

2251 TTAGAGAAAC GGTTTGCAGA GCAATATTAT CATGTTGTAG CGATGTATTC 

2301 TCCAGATGTT TGTCGTAGTA ACCCCAAATG TACGACTACC CTACTTTCCA 

2351 ACCAAGGGAG TTGGAAGACC AAAGGTTCGA ACTTAGCAAG ACAGGCTGGT 

2401 ATTGTTCAGG CCTCAGGTTT TCGATCTTTG GGAGCTGCAG CAGAGCTTTT 

2451 CGGGAACTTT GGCTTTGAAT GGCGGGGATC TTCTCGTAGC TATAATGTAG 

2501 ATGCGGGTAG CAAAATCAAA TTTTAG 



The PSORT algorithm predicts outer membrane (0.92). 

The protein was expressed in E.coli and purified as a GST-fusion product (Figure 70A). The 
recombinant protein was used to immunise mice, whose sera were used in a Western blot and for 
FACS analysis (Figure 70B). 

The cp6270 protein was also identified in the 2D-PAGE experiment (Cpn0013). 

These experiments show that cp6270 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 



Example 71 

The following C.pneumoniae protein (pid 4376402) was expressed <SEQ ID 141; cp6402>: 

1 MNVADLLSHL ETLLSSKIFQ DYGPNGLQVG DPQTPVKKIA VAVTADLETI 

51 KQAVAAEANV LIVHHGIFWK GMPYPITGMI HKRIQLLIEH NIQLIAYHLP 

101 LDAHPTLGNN WKVALDLNWH DLKPFGSSLP YLGVQGSFSP IDIDSFIDLL 

151 SQYYQAPLKG SALGGPSRVS SAALISGGAY RELSSAATSQ VDCFITGNFD 

201 EPAWSTALES NINFLAFGHT ATEKVGPKSL AEHLKSEFPI STTFIDTANP 

251 F* 

The cp6402 nucleotide sequence <SEQ ID 142> is: 

1 ATGAATGTTG CGGATCTCCT TTCTCATCTT GAGACTCTTC TCTCATCAAA 

51 AATATTTCAG GATTATGGAC CCAACGGACT TCAAGTTGGA GATCCCCAAA 

101 CTCCGGTAAA GAAAATCGCT GTTGCAGTTA CCGCAGATCT AGAAACCATA 

151 AAACAAGCTG TTGCGGCCGA AGCAAACGTT CTCATTGTAC ACCACGGAAT 

201 TTTTTGGAAA GGTATGCCCT ATCCTATTAC CGGCATGATC CATAAGCGCA 

251 TCCAATTACT AATAGAACAC AATATCCAAC TCATTGCCTA CCACCTTCCT 

301 TTGGATGCTC ACCCTACCTT AGGAAATAAC TGGAGAGTTG CCCTGGATCT 

351 AAATTGGCAT GACTTGAAGC CCTTTGGTTC TTCCCTCCCT TATTTAGGAG 

401 TGCAAGGCTC TTTCTCTCCT ATCGATATAG ATTCTTTCAT TGACCTGTTA 

451 TCTCAATATT ACCAAGCTCC CCTAAAAGGA TCTGCCTTGG GCGGCCCCTC 

501 TAGAGTCTCC TCAGCAGCTC TGATCTCAGG AGGAGCTTAT AGAGAACTCT 

551 CTTCGGCAGC CACGTCCCAA GTCGATTGCT TCATCACAGG AAATTTTGAT 

601 GAACCTGCAT GGTCGACAGC TCTAGAAAGC AATATCAACT TCCTAGCATT 

651 TGGACATACA GCCACAGAAA AAGTAGGTCC AAAATCTCTT GCAGAGCATC 
701 TAAAAAGCGA ATTTCCTATT TCCACAACCT TTATAGATAC GGCCAACCCC 

751 TTCTAA 

The PSORT algorithm predicts cytoplasmic (0.158). 

The protein was expressed in E.coli and purified as a GST-fusion product (Figure 71A). The 
recombinant protein was used to immunise mice, whose sera were used in a Western blot (Figure 
71B) and for FACS analysis. 

These experiments show that cp6402 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 
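
A listing such as the cp6402 pair above can be cross-checked by translating the nucleotide sequence and comparing it with the protein sequence. The following is only a minimal sketch, not part of the original disclosure: it assumes the standard genetic code, reads in frame 1, and uses hypothetical helper names (`clean_listing`, `translate`). Any character other than A, C, G or T is discarded, so position numbers and spacing are ignored.

```python
import re

# Standard genetic code, codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def clean_listing(text):
    """Drop position numbers, spaces and line breaks; keep only A, C, G, T."""
    return re.sub(r"[^ACGT]", "", text.upper())


def translate(dna):
    """Translate in frame 1, writing '*' for a stop codon.

    Trailing bases that do not fill a complete codon are ignored.
    """
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


# First row of the cp6402 nucleotide listing (SEQ ID 142).
row = "1 ATGAATGTTG CGGATCTCCT TTCTCATCTT GAGACTCTTC TCTCATCAAA"
print(translate(clean_listing(row)))
```

Run on the first row of SEQ ID 142, the sketch gives the first sixteen residues of SEQ ID 141. A disagreement at a single position may indicate a transcription error in one of the two listings rather than a real difference.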

Example 72 

The following C.pneumoniae protein (pid 4376520) was expressed <SEQ ID 143; cp6520>: 

1 MKHYLSFSPS ADFFSKQGAI ETQVLFGERV LVKGSTCYAY SQLFHNELLW 

51 KPYPGHSFRS TLVPCTPEFH IHPNVSVVSV DAFLDPWGIP LPFGTLLHVN 

101 SQNTVIFPKD ILNHMNTIWG SGTPQCDPRH LRRLNYNFFA ELLIKDADLL 

151 LNFPYVWGGR SVHESLEKPG VDCSGFINIL YQAQGYNVPR NAADQYADCH 

201 WISSFENLPS GGLIFLYPKE EKRISHVMLK QDSSTLIHAS GGGKKVEYFI 

251 LEQDGKFLDS TYLFFRNNQR GRAFFGIPRK RKAFL* 

The cp6520 nucleotide sequence <SEQ ID 144> is: 

1 ATGAAACACT ACCTATCATT TTCTCCTTCT GCTGATTTTT TCTCTAAACA 

51 GGGTGCTATT GAAACTCAAG TCCTTTTTGG AGAGCGCGTC TTAGTCAAAG 

101 GGAGCACCTG CTATGCATAT TCCCAATTAT TCCACAATGA GCTGTTATGG 

151 AAGCCCTATC CAGGTCATAG CTTTCGTTCT ACCCTAGTCC CCTGCACTCC 

201 TGAATTTCAT ATCCATCCAA ATGTTTCTGT GGTTTCTGTG GATGCATTTT 

251 TAGATCCTTG GGGGATCCCT CTTCCTTTTG GAACTTTACT CCATGTGAAT 

301 TCTCAAAATA CCGTTATTTT CCCTAAGGAT ATTCTCAATC ATATGAACAC 

351 CATCTGGGGC TCCGGCACAC CTCAATGCGA TCCTAGACAT CTACGTCGTC 

401 TAAATTATAA CTTCTTTGCT GAACTTTTAA TTAAAGACGC AGACCTTTTA 

451 CTGAACTTTC CCTATGTATG GGGAGGACGG TCTGTACACG AAAGTCTGGA 

501 AAAGCCGGGT GTTGATTGTT CGGGATTTAT CAATATCCTT TACCAGGCAC 

551 AGGGATACAA CGTCCCTAGA AACGCTGCAG ATCAATATGC GGATTGTCAT 

601 TGGATCTCTA GCTTTGAGAA CCTTCCTTCT GGTGGGTTAA TATTTCTTTA 

651 CCCTAAAGAA GAAAAGCGTA TTTCTCATGT TATGTTGAAA CAGGATAGTT 

701 CCACCCTCAT TCATGCTTCT GGTGGAGGGA AAAAAGTGGA GTATTTCATT 

751 TTAGAACAAG ATGGGAAGTT TTTAGATTCG ACTTATCTAT TTTTTAGAAA 

801 TAATCAGAGG GGACGGGCAT TTTTTGGGAT CCCTAGAAAA AGAAAAGCCT 

851 TTCTGTAA 

The PSORT algorithm predicts cytoplasmic (0.265). 

The protein was expressed in E.coli and purified as a GST-fusion product (Figure 72A). The 
recombinant protein was used to immunise mice, whose sera were used in a Western blot (Figure 
72B) and for FACS analysis. 

These experiments show that cp6520 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 
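
The position numbers in these listings allow a quick consistency check: a coding sequence that ends in a stop codon should contain three nucleotides per residue plus three for the stop. The sketch below is illustrative only; `listing_length` is a hypothetical helper, and the two rows are the final rows of the cp6520 protein and nucleotide listings above.

```python
def listing_length(last_row):
    """Length implied by a listing's final row: start position + symbols - 1."""
    position, *groups = last_row.split()
    symbols = "".join(groups).rstrip("*")  # '*' marks the stop, not a residue
    return int(position) + len(symbols) - 1


protein_len = listing_length("251 LEQDGKFLDS TYLFFRNNQR GRAFFGIPRK RKAFL*")
dna_len = listing_length("851 TTCTGTAA")

# Three bases per residue, plus one stop codon.
print(protein_len, dna_len, dna_len == 3 * (protein_len + 1))
```

A mismatch would suggest that a row or a block of ten symbols has been lost from one of the listings.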

Example 73 

The following C.pneumoniae protein (pid 4376567) was expressed <SEQ ID 145; cp6567>: 

1 MTSPIPFQSS GDASFLAEQP QQLPSTSESQ LVTQLLTMMK HTQALSETVL 

51 QQQRDRLPTA SIILQVGGAP TGGAGAPFQP GPADDHHHPI PPPVVPAQIE 

101 TEITTIRSEL QLMRSTLQQS TKGARTGVLV VTAILMTISL IAIIIIILAV 

151 LGFTGVLPQV ALLMQGETNL IWAMVSGSII CPIALIGTLG LILTNKNTPL 

201 PAS* 

The cp6567 nucleotide sequence <SEQ ID 146> is: 



1 ATGACCTCAC CGATCCCCTT TCAGTCTAGT GGCGATGCCT CTTTCCTTGC 

51 CGAGCAGCCA CAGCAACTCC CGTCTACTTC TGAATCTCAG CTAGTAACTC 

101 AATTGCTAAC CATGATGAAG CATACTCAAG CATTATCCGA AACGGTTCTT 

151 CAACAACAAC GCGATCGATT ACCAACCGCA TCTATTATCC TTCAAGTAGG 

201 AGGAGCTCCT ACAGGAGGAG CGGGTGCGCC TTTTCAACCA GGACCGGCAG 

251 ATGATCATCA TCATCCCATA CCGCCGCCTG TTGTACCAGC TCAAATAGAA 

301 ACAGAAATCA CCACTATAAG ATCCGAGTTA CAGCTCATGC GATCTACTCT 

351 ACAACAAAGC ACAAAAGGAG CTCGTACAGG AGTTCTAGTG GTTACTGCAA 

401 TCTTAATGAC GATCTCCTTA TTGGCTATTA TTATCATAAT ACTAGCTGTG 

451 CTTGGATTTA CGGGCGTCTT GCCTCAAGTA GCTTTATTGA TGCAGGGTGA 

501 AACAAATCTG ATTTGGGCTA TGGTGAGCGG TTCTATTATT TGCTTTATTG 

551 CGCTAATTGG AACTCTAGGA TTAATTTTAA CAAATAAGAA CACGCCTCTA 

601 CCGGCTTCTT AA 



The PSORT algorithm predicts inner membrane (0.694). 

The protein was expressed in E.coli and purified as a GST-fusion product (Figure 73A). The 
recombinant protein was used to immunise mice, whose sera were used in a Western blot (Figure 
73B) and for FACS analysis. 

These experiments show that cp6567 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 

Example 74 

The following C.pneumoniae protein (pid 4376576) was expressed <SEQ ID 147; cp6576>: 



1 MLIMRNKVIL QISILALIQT PLTLFSTEKV KEGHVVVDSI TIITEGENAS 

51 NKHPLPKLKT RSGALFSQLD FDEDLRILAK EYDSVEPKVE FSEGKTNIAL 

101 HLIAKPSIRN IHISGNQVVP EHKILKTLQI YRNDLFEREK FLKGLDDLRT 

151 YYLKRGYFAS SVDYSLEHNQ EKGHIDVLIK INEGPCGKIK QLTFSGISRS 

201 EKSDIQEFIQ TKQHSTTTSW FTGAGLYHPD IVEQDSLAIT NYLHNNGYAD 

251 AIVNSHYDLD DKGNILLYMD IDRGSRYTLG HVHIQGFEVL PKRLIEKQSQ 

301 VGPNDLYCPD KIWDGAHKIK QTYAKYGYIN TNVDVLFIPH ATRPIYDVTY 

351 EVSEGSPYKV GLIKITGNTH TKSDVILHET SLFPGDTFNR LKLEDTEQRL 

401 RNTGYFQSVS VYTVRSQLDP MGNADQYRDI FVEVKETTTG NLGLFLGFSS 

451 LDNLFGGIEL SESNFDLFGA RNIFSKGFRC LRGGGEKLFL KANFGDKVTD 

501 YTLKWTKPHF LNTPWILGIE LDKSINRALS KDYAVQTYGG NVSTTYILNE 

551 HLKYGLFYRG SQTSLHEKRK FLLGPNIDSN KGFVSAAGVN LNYDSVDSPR 

601 TPTTGIRGGV TFEVSGLGGT YHFTKLSLNS SIYRKLTRKG ILKIKGEAQF 

651 IKPYSNTTAE GVPVSERFFL GGETTVRGYK SFIIGPKYSA TEPQGGLSSL 

701 LISEEFQYPL IRQPNISAFV FLDSGFVGLQ EYKISLKDLR SSAGFGLRFD 

751 VMNNVPVMLG FGWPFRPTET LNGEKIDVSQ RFFFALGGMF * 

A predicted signal peptide is highlighted. 

The cp6576 nucleotide sequence <SEQ ID 148> is: 

1 ATGCTCATCA TGCGAAATAA AGTTATCTTG CAAATATCTA TTCTAGCGTT 

51 AATCCAAACC CCTTTAACTT TATTTTCTAC TGAAAAAGTT AAAGAAGGCC 

101 ATGTGGTGGT AGACTCTATC ACAATCATAA CGGAAGGAGA AAATGCTTCA 

151 AATAAACATC CCTTACCCAA ATTAAAGACC AGAAGTGGGG CTCTTTTTTC 

201 TCAATTAGAT TTTGATGAAG ACTTGAGAAT TCTAGCTAAA GAATACGACT 

251 CTGTTGAGCC TAAAGTAGAA TTTTCTGAAG GGAAAACTAA CATAGCCCTT 

301 CACCTAATAG CTAAACCCTC AATTCGAAAT ATTCATATCT CAGGAAATCA 

351 AGTCGTTCCT GAACATAAAA TTCTTAAAAC CCTACAAATT TACCGTAATG 

401 ATCTCTTTGA ACGAGAAAAA TTTCTTAAGG GTCTTGATGA TCTAAGAACG 

451 TATTATCTCA AGCGAGGATA TTTCGCATCC AGTGTAGACT ACAGTCTGGA 

501 ACACAATCAA GAAAAAGGTC ACATCGATGT TTTAATTAAA ATCAATGAAG 

551 GTCCTTGCGG GAAAATTAAA CAGCTTACGT TCTCAGGAAT CTCTCGATCA 

601 GAAAAATCAG ATATCCAAGA ATTTATTCAA ACCAAGCAGC ACTCTACAAC 

651 TACAAGTTGG TTTACTGGAG CTGGACTCTA TCACCCAGAT ATTGTTGAAC 

701 AAGATAGCTT GGCAATTACG AATTACCTAC ATAATAACGG GTACGCTGAT 

751 GCTATAGTCA ACTCTCACTA TGACCTTGAC GACAAAGGGA ATATTCTTCT 

801 TTACATGGAT ATTGATCGAG GGTCGCGATA TACCTTAGGA CACGTCCATA 

851 TCCAAGGGTT TGAGGTTTTG CCAAAACGCC TTATAGAAAA GCAATCCCAA 

901 GTCGGCCCCA ATGATCTTTA TTGCCCCGAT AAAATATGGG ATGGGGCTCA 

951 TAAGATCAAA CAAACTTATG CAAAGTATGG CTACATCAAT ACCAATGTAG 

1001 ACGTTCTCTT CATCCCTCAC GCAACCCGCC CTATTTATGA TGTAACTTAT 

1051 GAGGTAAGTG AAGGGTCTCC TTATAAAGTT GGGTTAATTA AAATTACTGG 

1101 GAATACCCAT ACAAAATCTG ACGTTATTTT ACACGAAACC AGTCTCTTCC 

1151 CAGGAGATAC ATTCAATCGC TTAAAGCTAG AAGATACTGA GCAACGTTTA 

1201 AGAAATACAG GCTACTTCCA AAGCGTTAGT GTCTATACAG TTCGTTCTCA 

1251 ACTTGATCCT ATGGGCAATG CGGATCAATA CCGAGATATT TTTGTAGAAG 

1301 TCAAAGAAAC AACAACAGGA AACTTAGGCT TATTCTTAGG ATTTAGTTCT 

1351 CTTGACAATC TTTTTGGAGG AATTGAACTA TCTGAAAGTA ATTTTGATCT 

1401 ATTTGGAGCT AGAAATATAT TTTCTAAAGG TTTTCGTTGT CTAAGAGGCG 

1451 GTGGAGAACA TCTATTCTTA AAAGCCAACT TCGGGGACAA AGTCACAGAC 

1501 TATACTTTGA AGTGGACCAA ACCTCATTTT CTAAACACTC CTTGGATTTT 

1551 AGGAATTGAA TTAGATAAAT CAATTAACAG AGCATTATCT AAAGATTATG 

1601 CTGTCCAAAC CTATGGCGGG AACGTCAGCA CAACGTATAT CTTGAACGAA 

1651 CACCTGAAAT ACGGTCTATT TTATCGAGGA AGTCAAACGA GTTTACATGA 

1701 AAAACGTAAG TTCCTCCTAG GGCCAAATAT AGACAGCAAT AAAGGATTTG 

1751 TCTCTGCTGC AGGTGTCAAC TTGAATTACG ATTCTGTAGA TAGTCCTAGA 

1801 ACTCCAACTA CAGGGATTCG CGGGGGGGTG ACTTTTGAGG TTTCTGGTTT 

1851 GGGAGGAACT TATCATTTTA CAAAACTCTC TTTAAACAGC TCTATCTATA 

1901 GAAAACTTAC GCGTAAAGGT ATTTTGAAAA TCAAAGGGGA AGCTCAATTT 

1951 ATTAAACCCT ATAGCAATAC TACAGCTGAA GGAGTTCCTG TCAGTGAGCG 

2001 CTTCTTCCTA GGTGGAGAGA CTACAGTTCG GGGATATAAA TCCTTTATTA 

2051 TCGGTCCAAA ATACTCTGCT ACAGAACCTC AGGGAGGACT CTCTTCGCTC 

2101 CTTATTTCAG AAGAGTTTCA ATACCCTCTC ATCAGACAAC CTAATATTAG 

2151 TGCCTTTGTA TTCTTAGACT CAGGTTTTGT CGGTTTACAA GAGTATAAGA 

2201 TTTCGTTAAA AGATCTACGT AGTAGTGCTG GATTTGGTCT GCGCTTCGAT 

2251 GTAATGAATA ATGTTCCTGT TATGTTAGGA TTTGGTTGGC CCTTCCGTCC 

2301 AACCGAGACT TTGAATGGAG AAAAAATTGA TGTATCTCAG CGATTCTTCT 

2351 TTGCTTTAGG GGGCATGTTC TAA 



The PSORT algorithm predicts outer membrane (0.7658). 



The protein was expressed in E.coli and purified as GST-fusion (Figure 74A), his-tag and 
his-tag/GST-fusion products. The recombinant proteins were used to immunise mice, whose sera were 
used in a Western blot (Figure 74B) and for FACS analysis (Figure 74C). 

The cp6576 protein was also identified in the 2D-PAGE experiment (Cpn0300). 

These experiments show that cp6576 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 

Example 75 

The following C.pneumoniae protein (pid 4376607) was expressed <SEQ ID 149; cp6607>: 

1 MNKRQKDKLK ICVIISTLIL VGIFARAPRG DTFKTFLKSE EAIIYSNQCN 

51 EDMRKILCDA IEHADKEIFL RIYNLSEPKI QQSLTRQAQA KNKVTIYYQK 

101 FKIPQILKQA SNVTLVEQPP AGRKLMHQKA LSIDKKDAWL GSANYTNLSL 

151 RLDNNLILGM HSSELCDLII TNTSGDFSIK DQTGKYFVLP QDRKIAIQAV 

201 LEKIQTAQKT IQVAMFALTH SEIIQALHQA KQRGIHVDII IDRSHSKLTF 

251 KQLRQLNINK DFVSINTAPC TLHHKFAVID NKTLLAGSIN WSKGRFSLND 

301 ESLIILENLT KQQNQKLRMI WKDLAKHSEH PTVDDEEKEI IEKSLPVEEQ 

351 EAA* 



A predicted signal peptide is highlighted. 

The cp6607 nucleotide sequence <SEQ ID 150> is: 

1 ATGAATAAAA GACAAAAAGA TAAATTAAAA ATCTGTGTTA TTATTAGCAC 

51 GTTGATTTTA GTAGGAATTT TTGCAAGAGC TCCTCGTGGT GACACTTTTA 

101 AGACTTTTTT AAAGTCTGAA GAAGCTATCA TCTACTCAAA TCAATGCAAT 

151 GAGGACATGC GTAAAATTCT ATGCGATGCT ATAGAACACG CTGATGAAGA 

201 GATCTTCCTA CGTATTTATA ACCTCTCAGA ACCCAAGATC CAACAGAGTT 

251 TAACTCGACA AGCTCAAGCA AAAAACAAAG TTACGATCTA CTATCAAAAA 

301 TTTAAAATTC CCCAAATCTT AAAGCAAGCC AGCAATGTAA CTTTAGTCGA 

351 GCAACCTCCA GCAGGGCGTA AACTGATGCA TCAAAAAGCT CTTTCCATAG 

401 ATAAGAAAGA TGCTTGGCTA GGATCTGCGA ACTACACCAA TCTTTCTCTA 

451 CGTTTAGATA ATAATCTCAT TCTAGGAATG CATAGCTCGG AGCTCTGTGA 

501 TCTCATTATC ACAAATACCT CTGGAGACTT TTCTATAAAG GATCAAACAG 

551 GAAAGTATTT TGTTCTTCCT CAAGATCGTA AAATTGCAAT ACAAGCTGTA 

601 CTCGAAAAAA TCCAGACAGC TCAGAAAACC ATCCAAGTTG CTATGTTTGC 

651 TCTGACCCAC TCGGAGATTA TTCAAGCCTT ACATCAAGCA AAACAACGAG 

701 GAATCCATGT AGATATTATC ATTGATAGAA GTCATAGCAA ACTTACTTTT 

751 AAGCAATTAC GACAATTAAA TATCAATAAA GACTTTGTTT CTATAAATAC 

801 CGCACCCTGT ACTCTTCACC ATAAGTTTGC AGTTATAGAT AATAAAACTC 

851 TACTTGCAGG ATCTATAAAT TGGTCTAAAG GAAGATTCTC CTTAAATGAT 

901 GAAAGCTTGA TCATACTGGA AAACCTGACC AAACAACAAA ATCAGAAACT 

951 TCGAATGATT TGGAAAGATC TAGCTAAGCA TTCAGAACAT CCTACAGTAG 

1001 ACGATGAAGA AAAAGAAATT ATAGAAAAAA GTCTTCCAGT AGAAGAGCAA 

1051 GAAGCAGCGT GA 

The PSORT algorithm predicts periplasmic (0.934). 

The protein was expressed in E.coli and purified as a his-tagged product (Figure 75A) and also as a 
GST-fusion. The GST-fusion protein was used to immunise mice, whose sera were used in a Western 
blot (Figure 75B) and for FACS analysis. 

These experiments show that cp6607 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 



Example 76 

The following C.pneumoniae protein (pid 4376624) was expressed <SEQ ID 151; cp6624>: 

1 MDAKMGYIFK VMRWIFCFVA CGITFGCTNS GFQNANSRPC ILSMNRMIHD 

51 CVERVVGNRL ATAVLIKGSL DPHAYEMVKG DKDKIAGSAV IFCNGLGLEH 

101 TLSLRKHLEN NPNSVKLGER LIARGAFVPL EEDGICDPHI WMDLSIWKEA 

151 VIEITEVLIE KFPEWSAEFK ANSEELVCEM SILDSWAKQC LSTIPENLRY 

201 LVSGHNAFSY FTRRYLATPE EVASGAWRSR CISPEGLSPE AQISVRDIMA 

251 VVDYINEHDV SVVFPEDTLN QDALKKIVSS LKKSHLVRLA QKPLYSDNVD 

301 DNYFSTFKHN VCLITEELGG VALECQR* 

The cp6624 nucleotide sequence <SEQ ID 152> is: 

1 ATGGATGCGA AAATGGGATA TATATTTAAA GTGATGCGTT GGATTTTCTG 

51 TTTCGTGGCA TGTGGTATAA CTTTTGGATG TACCAATTCT GGGTTTCAGA 

101 ATGCAAATTC ACGTCCTTGT ATACTATCCA TGAATCGCAT GATTCATGAT 

151 TGTGTTGAAA GAGTCGTGGG GAATAGGCTT GCTACCGCTG TTTTGATCAA 

201 AGGATCCTTA GACCCTCATG CGTATGAGAT GGTTAAAGGG GATAAGGACA 

251 AGATTGCTGG AAGTGCCGTA ATTTTTTGTA ACGGCCTGGG TCTTGAGCAT 

301 ACATTAAGTT TGCGGAAGCA TTTAGAAAAT AATCCCAATA GTGTCAAGTT 

351 AGGGGAGCGG TTGATAGCGC GTGGGGCCTT TGTTCCTCTA GAAGAAGACG 

401 GTATTTGCGA TCCTCATATC TGGATGGATC TTTCTATTTG GAAGGAAGCT 

451 GTCATAGAAA TTACAGAAGT TCTCATTGAA AAGTTCCCTG AATGGTCTGC 

501 TGAATTTAAA GCAAATAGTG AGGAACTTGT TTGTGAAATG TCTATTTTAG 

551 ATTCTTGGGC GAAACAATGC TTGAGCACAA TTCCTGAAAA TTTACGGTAT 

601 CTTGTCTCAG GTCATAATGC GTTCAGTTAC TTTACACGTC GCTATTTAGC 

651 TACTCCTGAA GAAGTGGCTT CCGGAGCATG GAGGTCTCGT TGTATTTCTC 

701 CTGAGGGTCT ATCTCCAGAA GCTCAAATCA GTGTTCGTGA TATTATGGCG 

751 GTTGTAGATT ATATTAATGA GCATGATGTC AGTGTGGTTT TCCCTGAGGA 

801 TACTCTGAAC CAAGATGCGT TGAAAAAAAT TGTTTCTTCT CTGAAGAAAA 

851 GTCATTTAGT TCGTCTAGCT CAAAAACCAT TGTATAGTGA TAATGTGGAC 

901 GACAATTATT TTAGCACCTT TAAACATAAT GTCTGCCTTA TCACAGAAGA 
951 ATTAGGAGGG GTGGCTCTTG AATGTCAAAG ATGA 

The PSORT algorithm predicts inner membrane (0.168). 

The protein was expressed in E.coli and purified as a his-tag product (Figure 76A). The recombinant 
protein was used to immunise mice, whose sera were used in a Western blot (Figure 76B) and for 
FACS analysis. 

The cp6624 protein was also identified in the 2D-PAGE experiment. 

These experiments show that cp6624 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 

Example 77 

The following C.pneumoniae protein (pid 4376728) was expressed <SEQ ID 153; cp6728>: 

1 MKSSVSWLFF SSIPLFSSLS IVAAEVTLDS SNNSYDGSNG TTFTVFSTTD 

51 AAAGTTYSLL SDVSFQNAGA LGIPLASGCF LEAGGDLTFQ GNQHALKFAF 

101 INAGSSAGTV ASTSAADKNL LFNDFSRLSI ISCPSLLLSP TGQCALKSVG 

151 NLSLTGNSQI IFTQNFSSDN GGVINTKNFL LSGTSQFASF SRNQAFTGKQ 

201 GGVVYATGTI TIENSPGIVS FSQNLAKGSG GALYSTDNCS ITDNFQVIFD 

251 GNSAWEAAQA QGGAICCTTT DKTVTLTGNK NLSFTMNTAL TYGGAISGLK 

301 VSISAGGPTL FQSNISGSSA GQGGGGAINI ASAGELALSA TSGDITFNNN 

351 QVTNGSTSTR NAINIIDTAK VTSIRAATGQ SIYFYDPITN PGTAASTDTL 

401 NLNLADANSE IEYGGAIVFS GEKLSPTEKA IAANVTSTIR QPAVLARGDL 

451 VLRDGVTVTF KDLTQSPGSR ILMDGGTTLS AKEANLSLNG LAVNLSSLDG 

501 TNKAALKTEA ADKNISLSGT IALIDTEGSF YENHNLKSAS TYPLLELTTA 

551 GANGTITLGA LSTLTLQEPE THYGYQGNWQ LSWANATSSK IGSINWTRTG 

601 YIPSPERKSN LPLNSLWGNF IDIRSINQLI ETKSSGEPFE RELWLSGIAN 

651 FFYRDSMPTR HGFRHISGGY ALGITATTPA EDQLTFAFCQ LFARDRNHIT 

701 GKNHGDTYGA SLYFHHTEGL FDIANFLWGK ATRAPWVLSE ISQIIPLSFD 

751 AKFSYLHTDN HMKTYYTDNS IIKGSWRNDA FCADLGASLP FVISVPYLLK 

801 EVEPFVKVQY IYAHQQDFYE RHAEGRAFNK SELINVEIPI GVTFERDSKS 

851 EKGTYDLTLM YILDAYRRNP KCQTSLIASD ANWMAYGTNL ARQGFSVRAA 

901 NHFQVNPHME IFGQFAFEVR SSSRNYNTNL GSKFCF* 

The cp6728 nucleotide sequence <SEQ ID 154> is: 

1 ATGAAGTCCT CTGTCTCTTG GTTGTTCTTT TCTTCAATCC CGCTCTTTTC 

51 ATCGCTCTCT ATAGTCGCGG CAGAGGTGAC CTTAGATAGC AGCAATAATA 

101 GCTATGATGG ATCTAACGGA ACTACCTTCA CGGTCTTTTC CACTACGGAC 

151 GCTGCTGCAG GAACTACCTA TTCCTTACTT TCCGACGTAT CCTTTCAAAA 

201 TGCAGGGGCT TTAGGAATTC CCTTAGCCTC AGGATGCTTC CTAGAAGCGG 

251 GCGGCGATCT TACTTTCCAA GGAAATCAAC ATGCACTGAA GTTTGCATTT 

301 ATCAATGCGG GCTCTAGCGC TGGAACTGTA GCCAGTACCT CAGCAGCAGA 

351 TAAGAATCTT CTCTTTAATG ATTTTTCTAG ACTCTCTATT ATCTCTTGTC 

401 CCTCTCTTCT TCTCTCTCCT ACTGGACAAT GTGCTTTAAA ATCTGTGGGG 

451 AATCTATCTC TAACTGGCAA TTCCCAAATT ATATTTACTC AGAACTTCTC 

501 GTCAGATAAC GGCGGTGTTA TCAATACGAA AAACTTCTTA TTATCAGGGA 

551 CATCTCAGTT TGCGAGCTTT TCGAGAAACC AAGCCTTCAC AGGGAAGCAA 

601 GGCGGTGTAG TTTACGCTAC AGGAACTATA ACTATCGAGA ACAGCCCTGG 

651 GATAGTTTCC TTCTCTCAAA ACCTAGCGAA AGGATCTGGC GGTGCTCTGT 

701 ACAGCACTGA CAACTGTTCG ATTACAGATA ACTTTCAAGT GATCTTTGAC 

751 GGCAATAGTG CTTGGGAAGC CGCTCAAGCT CAGGGCGGGG CTATTTGTTG 

801 CACTACGACA GATAAAACAG TGACTCTTAC TGGGAACAAA AACCTCTCTT 

851 TCACAAATAA TACAGCATTG ACATATGGCG GAGCCATCTC TGGACTCAAG 

901 GTCAGTATTT CCGCTGGAGG TCCTACTCTA TTTCAAAGTA ATATCTCAGG 

951 AAGTAGCGCC GGTCAGGGAG GAGGAGGAGC GATCAATATA GCATCTGCTG 

1001 GGGAACTCGC TCTCTCTGCT ACTTCTGGAG ATATTACCTT CAATAACAAC 

1051 CAAGTCACCA ACGGAAGCAC AAGTACAAGA AACGCAATAA ATATCATTGA 

1101 TACCGCTAAA GTCACATCGA TACGAGCTGC TACGGGGCAA TCTATCTATT 

1151 TCTATGATCC CATCACAAAT CCAGGAACCG CAGCTTCTAC CGACACATTG 

1201 AACTTAAACT TAGCAGATGC GAACAGTGAG ATCGAGTATG GGGGTGCGAT 

1251 TGTCTTTTCT GGAGAAAAGC TTTCCCCTAC AGAAAAAGCA ATCGCTGCAA 

1301 ACGTCACCTC TACTATCCGA CAACCTGCAG TATTAGCGCG GGGAGATCTT 

1351 GTACTTCGTG ATGGAGTCAC CGTAACTTTC AAGGATCTGA CTCAAAGTCC 

1401 AGGATCCCGC ATCTTAATGG ATGGGGGGAC TACACTTAGT GCTAAAGAGG 

1451 CAAATCTTTC GCTTAATGGC TTAGCAGTAA ATCTCTCCTC TTTAGATGGA 

1501 ACCAACAAGG CAGCTTTAAA AACAGAAGCT GCAGATAAAA ATATCAGCCT 

1551 ATCGGGAACG ATTGCGC TTA TTGACACGGA AGGGTCATTC TATGAGAATC 

1601 ATAACTTAAA AAGTGCTAGT ACCTATCCTC TTCTTGAACT TACCACCGCA

1651 GGAGCCAACG GAACGATTAC TCTGGGAGCT CTTTCTACCC TGACTCTTCA 

1701 AGAACCTGAA ACCCACTACG GGTATCAAGG AAACTGGCAG TTGTCTTGGG 

1751 CAAATGCAAC ATCCTCAAAA ATAGGAAGCA TCAACTGGAC CCGTACAGGA 

1801 TACATTCCTA GTCCTGAGAG AAAAAGTAAT CTCCCTCTAA ATAGCTTATG

1851 GGGAAACTTT ATAGATATAC GCTCGATCAA TCAGCTTATA GAAACCAAGT 

1901 CCAGTGGGGA GCCTTTTGAG CGTGAGCTAT GGCTTTCAGG AATTGCGAAT 

1951 TTCTTCTATA GAGATTCTAT GCCCACCCGC CATGGTTTCC GCCATATCAG 

2001 CGGGGGTTAT GCACTAGGGA TCACAGCAAC AACTCCTGCC GAGGATCAGC 

2051 TTACTTTTGC CTTCTGCCAG CTCTTTGCTA GAGATCGCAA TCATATTACA 

2101 GGTAAGAACC ACGGAGATAC TTACGGTGCC TCTTTGTATT TCCACCATAC 

2151 AGAAGGGCTC TTCGACATCG CCAATTTCCT CTGGGGAAAA GCAACCCGAG 

2201 CTCCCTGGGT GCTCTCTGAG ATCTCCCAGA TCATTCCTTT ATCGTTCGAT 

2251 GCTAAATTCA GTTATCTCCA TACAGACAAC CACATGAAGA CATATTATAC 

2301 CGATAACTCT ATCATCAAGG GTTCTTGGAG AAACGATGCC TTCTGTGCAG 

2351 ATCTTGGAGC TAGCCTGCCT TTTGTTATTT CCGTTCCGTA TCTTCTGAAA 

2401 GAAGTCGAAC CTTTTGTCAA AGTACAGTAT ATCTATGCGC ATCAGCAAGA 

2451 CTTCTACGAG CGTCATGCTG AAGGACGCGC TTTCAATAAA AGCGAGCTTA 

2501 TCAACGTAGA GATTCCTATA GGCGTCACCT TCGAAAGAGA CTCAAAATCA

2551 GAAAAGGGAA CTTACGATCT TACTCTTATG TATATACTCG ATGCTTACCG 

2601 ACGCAATCCT AAATGTCAAA CTTCCCTAAT AGCTAGCGAT GCTAACTGGA 

2651 TGGCCTATGG TACCAACCTC GCACGACAAG GTTTTTCTGT TCGTGCTGCG 

2701 AACCATTTCC AAGTGAACCC CCACATGGAA ATCTTCGGTC AATTCGCTTT 

2751 TGAAGTACGA AGTTCTTCAC GAAATTATAA TACAAACCTA GGCTCTAAGT

2801 TTTGTTTCTA G 



The PSORT algorithm predicts inner membrane (0.187). 

The protein was expressed in E.coli and purified as a GST-fusion product (Figure 77A). The 
recombinant protein was used to immunise mice, whose sera were used in a Western blot (Figure 
77B) and for FACS analysis. 

The cp6728 protein was also identified in the 2D-PAGE experiment. 

These experiments show that cp6728 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 

Example 78 

The following C.pneumoniae protein (pid 4376847) was expressed <SEQ ID 155; cp6847>: 



1 MFVMKKLVRL CVVLLSLLPN VLFSSDKLRE EGIKKMMDKL IEYHVDAQEV

51 STDILSRSLS SYIQSFDPHK SYLSNQEVAV FLQSPETKKR LLKNYKAGNF

101 AIYRNINQLI HESILRARQW RNEWVKNPKE LVLEASSYQI SKQPMQWSKS

151 LDEVKQRQRA LLLSYLSLHL AGASSSRYEG KEEQLAALCL RQIENHENVY

201 LGINDHGVAM DRDEEAYQFH IRVVKALAHS LDAHTAYFSK DEALAMRIQL

251 EKGMCGIGVV LKEDIDGVVV REIIPGGPAA KSGDLQLGDI IYRVDGKDIE

301 HLSFRGVLDC LRGGHGSTVV LDIHRGESDH TIALRREKIL LEDRRVDVSY

351 EPYGDGVIGK VTLHSFYEGE NQVSSEQDLR RAIQGLKEKN LLGLVLDIRE

401 NTGGFLSQAI KVSGLFMTNG VVVVSRYADG TMKCYRTVSP KKFYDGPLAI

451 LVSKSSASAA EIVAQTLQDY GVALVVGDEQ TYGKGTIQHQ TITGDASQDD

501 CFKVTVGKYY SPSGKSTQLQ GVKSDILIPS LYAEDRLGER FLEHPLPADC

551 CDNVLHDPLT DLDTQTRPWF QKYYLPNLQK QETLWREMLP QLTKNSEQRL

601 SENSNFQAFL SQIKSSEKTD LSYGSNDLQL EESINILKDM ILLQQCRK*



A predicted signal peptide is highlighted.

The cp6847 nucleotide sequence <SEQ ID 156> is:

1 ATGTTCGTAA TGAAAAAACT TGTCCGTCTA TGCGTAGTTC TTCTTTCTTT

51 ACTTCCGAAT GTATTATTTT CTTCGGATCT TTTACGAGAA GAGGGCATCA 

101 AAAAGATGAT GGACAAGCTG ATCGAGTATC ATGTCGATGC TCAAGAGGTT 

151 TCTACGGATA TACTCTCGCG TTCTTTATCT AGTTACATTC AATCTTTTGA 

201 TCCTCATAAA TCTTATCTTT CAAACCAAGA GGTTGCAGTT TTTCTACAGT 

251 CTCCGGAAAC AAAGAAACGT CTCTTAAAGA ATTATAAGGC AGGCAACTTT 

301 GCTATTTATC GCAACATCAA TCAATTAATT CATGAGAGTA TTCTTCGTGC 

351 CAGGCAGTGG AGAAACGAAT GGGTTAAGAA TCCAAAAGAG CTTGTATTGG 

401 AGGCATCCTC ATATCAGATA TCGAAGCAAC CTATGCAATG GAGCAAATCT 

451 TTAGACGAAG TGAAGCAGAG ACAACGCGCT CTACTCCTTT CCTATCTTTC 

501 TTTACATCTT GCTGGAGCTT CTTCCTCTCG TTATGAGGGT AAAGAAGAGC 

551 AGCTTGCTGC TCTGTGTCTA CGTCAAATCG AGAACCATGA GAATGTATAT 

601 TTAGGTATCA ACGATCATGG TGTTGCTATG GATCGGGATG AAGAAGCCTA 

651 CCAATTCCAT ATCCGTGTTG TTAAAGCTTT AGCTCATAGC TTAGATGCAC 

701 ATACGGCGTA TTTCAGTAAG GACGAAGCGT TGGCGATGCG AATCCAACTA 

751 GAAAAAGGCA TGTGTGGAAT TGGTGTTGTT CTGAAGGAAG ATATTGATGG 

801 AGTTGTTGTT AGAGAAATCA TTCCTGGGGG ACCTGCGGCT AAATCTGGGG 

851 ATCTTCAGCT TGGAGATATC ATCTATCGGG TGGATGGCAA GGATATCGAG 

901 CATCTTTCTT TCCGCGGTGT TTTAGATTGT TTACGTGGAG GTCATGGCTC 

951 TACTGTAGTC TTAGATATCC ATCGTGGGGA GAGCGATCAT ACGATCGCCT 

1001 TGAGAAGGGA GAAAATCCTT TTAGAAGACC GTCGTGTGGA TGTTTCCTAT 

1051 GAGCCTTATG GAGATGGTGT GATTGGGAAA GTTACGTTAC ATTCTTTTTA

1101 TGAAGGAGAA AATCAGGTTT CTAGTGAACA AGATCTACGT CGAGCGATTC 

1151 AGGGATTAAA GGAGAAGAAC CTTCTTGGAT TAGTTTTAGA TATCCGAGAA 

1201 AATACGGGTG GATTTTTATC TCAAGCGATC AAAGTTTCTG GTTTATTTAT 

1251 GACCAATGGC GTTGTGGTTG TATCTCGCTA TGCTGATGGT ACCATGAAGT 

1301 GCTACCGCAC AGTATCTCCT AAAAAATTCT ATGATGGTCC TTTGGCTATT 

1351 TTAGTATCTA AAAGTTCCGC ATCAGCAGCG GAGATTGTAG CACAAACTCT 

1401 CCAAGATTAT GGAGTTGCTT TAGTTGTTGG AGATGAGCAG ACCTATGGGA 

1451 AGGGAACGAT TCAGCATCAA ACAATTACTG GAGATGCCTC TCAGGACGAT 

1501 TGTTTTAAGG TTACTGTAGG GAAATATTAT TCCCCTTCTG GGAAATCGAC 

1551 TCAACTTCAG GGAGTAAAAT CCGATATTTT AATTCCTTCT CTCTATGCTG 

1601 AAGATCGTCT AGGAGAGCGT TTTCTAGAGC ATCCCTTACC TGCAGATTGC 

1651 TGTGATAATG TACTTCACGA TCCTCTCACG GACTTGGATA CTCAAACACG

1701 TCCTTGGTTT CAAAAATACT ATCTTCCTAA TCTACAAAAG CAAGAGACTC 

1751 TTTGGAGAGA GATGCTACCT CAGCTTACGA AAAACAGTGA GCAAAGGCTT 

1801 TCTGAGAATT CGAATTTTCA GGCATTTTTG TCGCAGATAA AATCATCTGA 

1851 AAAAACGGAC CTATCCTATG GTTCCAATGA TTTACAATTG GAAGAGTCGA 

1901 TAAACATTTT GAAGGACATG ATTTTATTAC AACAGTGTAG AAAATAA 

The PSORT algorithm predicts periplasmic (0.932).

The protein was expressed in E.coli and purified as a GST-fusion product (Figure 78A). The
recombinant protein was used to immunise mice, whose sera were used in a Western blot (Figure 
78B) and for FACS analysis. 

These experiments show that cp6847 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 

Example 79 

The following C.pneumoniae protein (pid 4376969) was expressed <SEQ ID 157; cp6969>: 

1 MRLFSLGTIY LFFSLALSSC CGYSILNSPY HLSSLGKSLL QERIFIAPIK

51 EDPHGQLCSA LTYELSKRSF AISGRSSCAG YTHKVELLNG IDKNIGFTYA 

101 PNKLGDKTHR HFIVSNEGRL SLSAKVQLIN NDTQEVLIDQ CVARESVDFD 

151 FEPDLGTANA HEFALGQFEM HSEAIKSARR ILSIRLAETI AQQVYYDLF*

A predicted signal peptide is highlighted. 

The cp6969 nucleotide sequence <SEQ ID 158> is: 

1 ATGAGATTGT TTTCTTTAGG CACGATTTAT CTTTTTTTTT CTCTAGCACT

51 TTCGTCATGC TGTGGTTACT CTATTTTAAA CAGCCCGTAT CACTTATCGT

101 CTTTAGGTAA GTCTTTATTA CAGGAAAGAA TTTTCATTGC TCCCATAAAA

151 GAAGATCCTC ATGGTCAGCT CTGCTCAGCT CTAACTTATG AGCTTAGTAA

201 GCGTTCTTTT GCTATCTCTG GAAGGAGTTC TTGCGCAGGC TATACTCTTA 

251 AAGTAGAGCT TCTGAATGGT ATTGACAAGA ATATAGGTTT TACGTATGCC 

301 CCAAATAAAC TCGGAGATAA GACTCACAGG CATTTTATAG TCTCTAATGA

351 AGGCAGACTA TCACTATCTG CAAAAGTACA GCTTATCAAT AATGACACTC 

401 AAGAAGTCCT TATAGACCAA TGTGTTGCTC GAGAGTCTGT AGACTTTGAC 

451 TTTGAGCCTG ACTTAGGAAC AGCAAACGCT CATGAATTTG CTTTAGGCCA

501 ATTTGAAATG CATAGTGAAG CCATAAAAAG TGCTCGCCGT ATACTATCTA

551 TACGCCTAGC CGAGACGATT GCTCAACAGG TATACTATGA CCTTTTTTGA 

The PSORT algorithm predicts inner membrane (0.126). 

The protein was expressed in E.coli and purified as a GST-fusion product (Figure 79A). The
recombinant protein was used to immunise mice, whose sera were used in a Western blot (Figure 
79B) and for FACS analysis. 

These experiments show that cp6969 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 

Example 80 

The following C.pneumoniae protein (pid 4377109) was expressed <SEQ ID 159; cp7109>: 

1 MKKTCCQNYR SIGVVFSVVL FVLTTQTLFA GHFIDIGTSG LYSWARGVSG

51 DGRVVVGYEG GNAFKYVDGE KFLLEGLVPR SEALVFKASY DGSVIIGISD

101 QDPSCRAVKW VNGALVDLGI FSEGMQSFAE GVSSDGKTIV GCLYSDDTET

151 NFAVKWDETG MVVLPNLPED RHSCAWDASE DGSVIVGDAM GSEEIAKAVY

201 WKDGEQHLLS NIPGAKRSSA HAVSKDGSFI VGEFISEENE VHAFVYHNGV

251 IKDIGTLGGD YSVATGVSRD GKVIVGHSTR TDGEYRAFKY VDGRMIDLGT

301 LGGSASFAFG VSDDGKTIVG KFETELGECH AFIYLDD*

A predicted signal peptide is highlighted. 

The cp7109 nucleotide sequence <SEQ ID 160> is: 

1 ATGAAAAAGA CATGTTGCCA AAATTACAGA TCGATAGGCG TTGTGTTCTC

51 TGTGGTACTT TTCGTTCTTA CAACACAGAC GCTGTTTGCA GGACATTTTA 

101 TTGATATTGG AACTTCTGGA TTATATTCTT GGGCTCGAGG TGTATCTGGA 

151 GATGGCCGCG TTGTCGTAGG TTATGAAGGT GGCAATGCAT TTAAATATGT 

201 TGATGGTGAG AAATTTCTGT TAGAAGGTTT GGTCCCGAGA TCCGAGGCCT 

251 TGGTATTTAA AGCTTCTTAT GATGGCTCTG TAATTATAGG AATCTCGGAT 

301 CAAGATCCGT CTTGCCGCGC TGTGAAGTGG GTAAACGGTG CACTTGTTGA 

351 TCTTGGAATA TTTTCTGAGG GAATGCAATC TTTTGCAGAG GGTGTTTCCA 

401 GTGATGGAAA GACGATTGTA GGGTGCCTAT ATAGTGATGA TACAGAGACA 

451 AACTTTGCTG TGAAGTGGGA TGAAACAGGA ATGGTTGTTC TCCCTAACTT 

501 ACCAGAAGAT CGACATTCTT GCGCTTGGGA TGCCTCTGAA GATGGCTCTG 

551 TGATTGTAGG GGACGCCATG GGTAGCGAGG AAATTGCCAA GGCAGTGTAC 

601 TGGAAGGACG GTGAACAACA TCTGCTTTCT AATATCCCAG GAGCTAAAAG 

651 ATCGTCAGCA CATGCAGTTT CTAAAGATGG ATCTTTTATC GTAGGCGAGT 

701 TCATCAGTGA AGAAAATGAA GTTCATGCCT TTGTTTATCA CAACGGTGTT 

751 ATCAAAGATA TCGGGACTTT AGGAGGAGAT TACTCTGTAG CAACTGGAGT 

801 TTCTAGGGAT GGTAAGGTCA TCGTGGGTCA TTCTACAAGA ACAGATGGTG 

851 AATACCGTGC ATTTAAATAT GTGGATGGAA GAATGATAGA TTTGGGGACT

901 TTAGGAGGTT CAGCATCTTT TGCTTTTGGT GTTTCTGACG ATGGCAAAAC 

951 AATCGTAGGA AAATTTGAAA CAGAGCTAGG AGAATGTCAT GCCTTTATCT 

1001 ACCTTGATGA TTAG 
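
Listings such as the one above are printed as numbered rows of ten-residue blocks. The following is a minimal Python sketch, not part of the original disclosure, of how such a listing could be collapsed into a single string before further analysis. The function name `clean_listing` and the constant `DNA_ALPHABET` are hypothetical, and the sketch assumes that every row begins with its position number and that anything other than A, C, G or T is a transcription error.

```python
import re

# Assumption: nucleotide listings contain only these four letters.
DNA_ALPHABET = frozenset("ACGT")


def clean_listing(listing: str, alphabet=DNA_ALPHABET) -> str:
    """Collapse a numbered, block-separated sequence listing into one string."""
    pieces = []
    for line in listing.splitlines():
        # Drop the leading position number (it may itself be split, e.g. "3 01").
        body = re.sub(r"^[\d\s]+", "", line)
        # Remove the spaces that separate the ten-residue blocks.
        pieces.append("".join(body.split()).upper())
    sequence = "".join(pieces)
    # Reject anything outside the expected alphabet rather than guessing.
    unexpected = sorted(set(sequence) - set(alphabet))
    if unexpected:
        raise ValueError(f"unexpected characters in listing: {unexpected}")
    return sequence


if __name__ == "__main__":
    # Final row of the cp7109 listing (SEQ ID 160).
    print(clean_listing("1001 ACCTTGATGA TTAG"))  # prints ACCTTGATGATTAG
```

Raising an error on unexpected characters, instead of silently dropping them, keeps transcription damage visible so that it can be checked against the source.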

The PSORT algorithm predicts outer membrane (0.887). 

The protein was expressed in E.coli and purified as a GST-fusion product (Figure 80A). The 
recombinant protein was used to immunise mice, whose sera were used in a Western blot (Figure 
80B) and for FACS analysis. 




These experiments show that cp7109 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 

Example 81 

The following C.pneumoniae protein (pid 4377110) was expressed <SEQ ID 161; cp7110>:

1 MAAIKQILRS MLSQSSLWMV LFSLYSLSGY CYVITDKPED DFHSSSAVKW

51 DHWGKTTLSR LSNKKASAKA VSGTGATTVG FIKDTWSRTY AVRWNYWGTK

101 ELPTSSWVKK SKATGISSDG SIIAGIVENE LSQSFAVTWK NNEMYLLPST

151 WAVQSKAYGI SSDGSVIVGS AKDAWSRTFA VKWTGHEAQV LPVGWAVKSV

201 ANSVSANGSI IVGSVQDASG ILYAVKWEGN TITHLGTLGG YSAIAKAVSN

251 NGKVIVGRSE TYYGEVHAFC HKNGVMSDLG TLGGSYSAAK GVSATGKVIV

301 GMSTTANGKL HAFKYVGGRM IDLGEYSWKE ACANAVSIDG EIIVGVQSE*

A predicted signal peptide is highlighted.

The cp7110 nucleotide sequence <SEQ ID 162> is:

1 ATGGCAGCTA TAAAACAAAT TTTACGTTCT ATGCTATCTC AGAGTAGCTT

51 ATGGATGGTC CTATTTTCAT TATATTCTCT ATCTGGTTAT TGCTATGTAA

101 TTACAGACAA ACCAGAAGAT GACTTCCATT CTTCATCCGC AGTAAAATGG

151 GATCATTGGG GAAAGACAAC TCTCTCAAGA TTATCAAATA AAAAAGCCTC

201 TGCAAAAGCT GTTTCAGGAA CTGGTGCTAC AACTGTCGGC TTTATAAAAG

251 ACACTTGGTC TCGAACATAC GCAGTAAGAT GGAATTATTG GGGGACCAAA

301 GAACTCCCTA CCAGCTCATG GGTAAAAAAA TCAAAAGCAA CAGGAATCTC

351 CTCTGATGGG TCTATAATCG CGGGGATTGT CGAGAATGAG CTTTCTCAAA

401 GTTTCGCAGT CACATGGAAA AACAATGAAA TGTATTTGCT CCCTTCCACA

451 TGGGCAGTGC AATCTAAAGC GTATGGAATT TCTTCTGATG GCTCTGTTAT

501 TGTAGGGAGT GCTAAGGATG CTTGGTCGCG AACTTTCGCT GTGAAGTGGA

551 CGGGACACGA GGCTCAGGTG TTACCAGTAG GCTGGGCTGT CAAATCTGTA

601 GCGAATTCTG TATCTGCCAA TGGATCTATA ATTGTAGGGT CTGTACAAGA

651 CGCCTCTGGA ATTCTTTATG CTGTAAAGTG GGAAGGGAAC ACTATTACAC

701 ATCTAGGAAC TTTAGGAGGC TATTCTGCCA TTGCAAAAGC TGTATCCAAT

751 AATGGCAAGG TCATTGTAGG GAGATCCGAA ACATATTATG GAGAGGTCCA

801 TGCTTTCTGT CATAAGAATG GCGTCATGTC AGACCTCGGC ACCCTCGGAG

851 GATCTTATTC TGCAGCTAAG GGAGTCTCTG CAACTGGAAA AGTTATTGTC

901 GGTATGTCCA CAACAGCAAA TGGGAAATTG CATGCCTTTA AATATGTCGG

951 TGGAAGAATG ATCGACTTAG GAGAGTATAG CTGGAAAGAA GCCTGTGCAA

1001 ACGCTGTTTC TATTGATGGA GAAATTATTG TTGGAGTCCA ATCAGAATAA

The PSORT algorithm predicts outer membrane (0.827).

The protein was expressed in E.coli and purified as a GST-fusion product (Figure 81A). The
recombinant protein was used to immunise mice, whose sera were used in a Western blot (Figure
81B) and for FACS analysis.

These experiments show that cp7110 is a surface-exposed and immunoaccessible protein, and that it
is a useful immunogen. These properties are not evident from the sequence alone.

Figure 191 shows a schematic representation of the structural relationships between cp7105,
cp7106, cp7107, cp7108, cp7109 and cp7110, each of which is identified herein. These six proteins
may be grouped in a new family of related outer membrane-associated proteins. These proteins have
a repeat structure in common (cf. the pmp family).

Example 82

The following C.pneumoniae protein (pid 4377127) was expressed <SEQ ID 163; cp7127>:

1 MVFFRNSLLH LVALSGMLCC SSGVALTIAE KMASLEHSGR GADDYEGMAS

51 FNANMREYSL QLSKLYEEAR KLRASGTEDE ALWKDLIRRI GEVRGYLREI

101 EELWAAEIRE KGGNLEDYAL WNHPETTIYN LVTDYGTEDS IYLIPQEIGA

151 IKIATLSKFV VPKESFEDCL TQILSRLGIG VRQVNSWIKE LYMMRKEGCS

201 VAGVFSSRKD LEALPETAYI GFVLNSNVDA HTNQHVLKKF INPETTHVDV

251 IAGRVWIFGS AGEVGELLKI YNFVQSESIR QEYRVIPLTK IDPGEMISIL

301 NAAFREDLTK DVSEESLGLR VVPLQYQGRS LFLSGTAALV QQALTLIREL

351 EEGIENPTDK TVFWYNVKHS DPQELAALLS QVHDVFSGEN KASVGAADGC

401 GSQLNASIQI DTTVSSSAKD GSVKYGNFIA DSKTGTLIMV VEKEVLPRIQ

451 MLLKKLDVPK KMVRIEVLLF ERKLAHEQKS GLNLLRLGEE VCKKGCSPSV

501 SWAGGTGILE FLFKGSTGSS IVPGYDLAYQ FLMAQEDVRI NASPSVVTMN

551 QTPARIAVVD EMSIAVSSDK DKAQYNRAQY GIMIKMLPVI NVGEEDGKSY

601 ITLETDITFD TTGKNHDDRP DVTRRNITNK VRIADGETVI IGGLRCKQMS

651 DSHDGIPFLG DIPGIGKLFG MSSTSDSLTE MFVFITPKIL ENPVEQQERK

701 EEALLSSRPG EREEYYQALA ASEAAARAAH KKLEMFPASG VSLSQVERQE

751 YDGC*

A predicted signal peptide is highlighted. 

The cp7127 nucleotide sequence <SEQ ID 164> is: 

1 ATGGTTTTTT TCCGTAATTC TTTACTGCAT TTAGTTGCCC TATCCGGAAT

51 GCTCTGTTGT TCTTCTGGAG TGGCTTTAAC GATAGCCGAG AAGATGGCTT

101 CTTTAGAGCA CTCGGGGAGA GGAGCAGACG ATTATGAGGG GATGGCTTCG

151 TTTAATGCCA ATATGAGGGA GTATAGCCTT CAGCTGAGCA AGTTGTATGA

201 GGAAGCACGA AAGCTACGCG CTTCTGGAAC TGAGGATGAA GCTCTGTGGA

251 AGGACTTAAT TCGACGGATT GGTGAGGTGC GAGGCTATCT TCGAGAGATC

301 GAGGAGCTTT GGGCTGCAGA AATTCGTGAG AAAGGGGGCA ATCTCGAGGA

351 CTACGCCCTC TGGAATGACC CAGAGACTAC GATTTACAAT CTTGTTACCG

401 ATTACGGAAC CGAAGACTCT ATTTATTTGA TTCCTCAAGA AATCGGAGCG

451 ATTAAAATCG CAACCTTATC GAAATTTGTA GTTCCTAAAG AGTCTTTCGA

501 AGACTGTCTC ACTCAGATCC TATCTCGCTT AGGTATTGGC GTGCGTCAGG

551 TCAATTCTTG GATTAAGGAA CTTTATATGA TGCGTAAGGA GGGCTGCAGT

601 GTTGCTGGAG TTTTTTCCTC CAGAAAAGAT TTAGAGGCGC TCCCAGAAAC

651 AGCCTATATT GGTTTTGTAT TGAATTCGAA CGTAGATGCG CATACCAATC

701 AACATGTCTT AAAAAAGTTC ATTAACCCTG AAACAACGCA TGTAGATGTG

751 ATTGCAGGAC GTGTGTGGAT TTTTGGTTCT GCGGGGGAAG TCGGCGAGCT

801 TCTGAAGATT TATAATTTTG TGCAGTCGGA GAGCATACGT CAAGAGTATC

851 GGGTGATTCC CTTAACTAAG ATCGATCCAG GGGAGATGAT TTCCATTCTC

901 AACGCAGCAT TTCGTGAGGA TCTGACTAAA GATGTTAGTG AAGAATCTTT

951 AGGCCTTCGT GTAGTTCCTT TACAGTATCA AGGGCGTTCG TTGTTTTTAA

1001 GTGGAACCGC GGCGTTAGTG CAGCAAGCGC TGACTCTCAT TCGAGAGCTT

1051 GAAGAAGGGA TTGAGAACCC TACGGATAAA ACAGTATTTT GGTATAACGT

1101 CAAGCACTCC GATCCCCAAG AGTTGGCGGC ATTGCTTTCC CAAGTCCATG

1151 ATGTCTTCTC TGGCGAGAAT AAGGCGAGTG TCGGAGCTGC AGATGGATGT

1201 GGGTCGCAAT TAAATGCCTC GATCCAAATT GATACTACAG TAAGTTCTTC

1251 TGCGAAAGAT GGCTCAGTGA AGTACGGAAA CTTCATCGCG GATTCTAAGA

1301 CAGGAACTCT GATTATGGTG GTTGAGAAAG AAGTTCTTCC ACGTATTCAG

1351 ATGCTACTTA AGAAACTAGA TGTCCCTAAA AAGATGGTCC GTATCGAGGT

1401 GCTGTTATTT GAAAGAAAAT TGGCACATGA GCAGAAATCT GGGTTAAATC

1451 TTCTACGTCT TGGTGAGGAA GTTTGTAAAA AAGGGTGCAG TCCTTCTGTG

1501 TCTTGGGCCG GGGGTACTGG CATACTAGAA TTTTTATTTA AAGGAAGTAC

1551 GGGATCTTCG ATAGTTCCTG GTTATGATCT CGCCTATCAA TTTTTAATGG

1601 CTCAAGAGGA CGTTCGGATT AATGCGAGTC CTTCTGTAGT TACTATGAAC

1651 CAAACCCCAG CACGGATTGC TGTTGTTGAT GAAATGTCAA TAGCGGTGTC

1701 TTCAGATAAA GATAAAGCGC AATACAATCG TGCGCAGTAC GGTATCATGA

1751 TAAAAATGCT CCCCGTAATT AATGTGGGAG AGGAAGACGG AAAAAGTTAC

1801 ATTACTTTAG AGACAGACAT CACCTTTGAT ACTACGGGAA AAAATCATGA

1851 TGATCGTCCT GATGTTACAA GGCGTAATAT TACTAATAAG GTGCGCATTG

1901 CTGACGGAGA GACTGTGATT ATTGGAGGTT TGCGTTGCAA ACAGATGTCA

1951 GATTCTCATG ATGGCATTCC TTTCCTTGGA GACATTCCTG GTATAGGGAA

2001 GTTATTTGGA ATGAGTTCCA CATCAGACAG TCTCACGGAG ATGTTTGTAT

2051 TTATCACTCC GAAGATCCTA GAAAATCCTG TAGAGCAACA AGAACGTAAA

2101 GAAGAAGCTT TACTCTCTTC GCGCCCTGGA GAGAGAGAAG AATACTATCA

2151 GGCTTTAGCA GCTAGTGAGG CTGCAGCACG AGCAGCTCAT AAAAAATTAG

2201 AGATGTTCCC GGCATCAGGA GTATCTTTAT CTCAGGTAGA GAGGCAAGAA

2251 TACGATGGCT GCTAG

The PSORT algorithm predicts periplasmic (0.920).

The protein was expressed in E.coli and purified as a GST-fusion product (Figure 82A) and also in
his-tagged form. The recombinant proteins were used to immunise mice, whose sera were used in a
Western blot (Figure 82B) and for FACS analysis.

These experiments show that cp7127 is a surface-exposed and immunoaccessible protein, and that it
is a useful immunogen. These properties are not evident from the sequence alone. 

Example 83 

The following C.pneumoniae protein (pid 4377133) was expressed <SEQ ID 165; cp7133>: 



1 MQPFIFTLLC LTSLVSLVAF DAANARKRCA CAQTIERGEN FFSIKRSACA

51 EIEYQEKSRH ASAIERISKD KGKVTPKQIA KVATKKKQRY RLLQVPFSRP

101 PNNSRYNLYA LLSEPPECYS DTASWYAIFI RLLRRAYVDT GNVPPGSEYA

151 IANALISNKQ EILERGAQLG PDVIETLTLP EEQAEIFYKM LKGSSNSQSL

201 LNFLHYEEKS LGHCKLNLIF MDPLLLEAVL DHPDAYRETS LLRDGIWEAV

251 KRQEHAIQEH GQAAALELFK TRTDFRLELR DKMQLLLSRY DLLPLLNKKM

301 FDYTLGSAGD YLFLVDPDTK AISRCRCPSK SIKL

A predicted signal peptide is highlighted.

The cp7133 nucleotide sequence <SEQ ID 166> is:

1 ATGCAACCTT TTATCTTTAC TTTACTGTGC TTGACATCTT TGGTTTCTTT

51 AGTCGCCTTT GATGCTGCGA ATGCTCGTAA ACGTTGTGCC TGTGCTCAAA

101 CTATAGAACG TGGAGAGAAC TTCTTTTCCA TAAAACGCTC TGCTTGTGCT

151 GAAATCGAAT ATCAAGAAAA ATCTCGCCAC GCCTCAGCAA TTGAAAGAAT

201 CTCAAAAGAT AAAGGCAAAG TCACTCCAAA GCAGATTGCG AAAGTAGCTA

251 CTAAGAAAAA GCAAAGATAC CGTTTATTGC AGGTTCCTTT TTCAAGGCCT

301 CCGAATAACT CAAGGTATAA CCTCTATGCT TTGCTTAGTG AACCTCCCGA

351 ATGCTATAGC GATACAGCAT CATGGTATGC TATTTTTATT CGGTTACTTC

401 GACGTGCTTA TGTAGACACG GGAAATGTAC CTCCTGGATC TGAGTATGCC

451 ATCGCTAATG CTTTGATAAG TAACAAACAA GAGATTTTAG AGAGGGGAGC

501 GCAGCTTGGA CCCGATGTTA TTGAAACTCT AACATTGCCT GAGGAACAAG

551 CCGAGATTTT TTATAAAATG CTCAAAGGGT CGTCAAACTC TCAGTCGCTA

601 CTGAATTTTC TGCATTATGA AGAGAAAAGC TTAGGCCACT GTAAGCTAAA

651 TCTGATCTTC ATGGATCCCC TACTGTTAGA AGCTGTTCTA GATCATCCCG

701 ATGCTTATAG GGAAACGTCG CTCCTGCGCG ATGGCATTTG GGAAGCGGTG

751 AAGCGTCAAG AACATGCCAT CCAAGAACAT GGCCAGGCAG CTGCTTTGGA

801 GCTTTTTAAA ACACGCACCG ACTTCCGCCT GGAGCTGCGA GATAAGATGC

851 AGTTACTTCT AAGTCGATAC GATTTGCTCC CCTTATTAAA TAAAAAAATG

901 TTCGACTACA CCTTAGGAAG TGCCGGAGAT TACTTATTTT TGGTAGACCC

951 AGATACTAAG GCAATTTCTC GATGTCGCTG CCCTTCAAAG AGTATTAAAT

1001 TATAA

The PSORT algorithm predicts outer membrane (0.92).

The protein was expressed in E.coli and purified as a GST-fusion product (Figure 83A) and also in
his-tagged form. The recombinant proteins were used to immunise mice, whose sera were used in a
Western blot (Figure 83B) and for FACS analysis.

These experiments show that cp7133 is a surface-exposed and immunoaccessible protein, and that it
is a useful immunogen. These properties are not evident from the sequence alone.

Example 84

The following C.pneumoniae protein (pid 4377222) was expressed <SEQ ID 167; cp7222>:



1 MNRRDMVITA VVVNAILLVA LFVTSKRIGV KDYDEGFRNF ASSKVTQAVV

51 SEEKVIEKPV VAEVPSRPIA KETLAAQFIE SKPVTVTTPP VPVVSETPEV

101 PTVAVPPQPV RETVKEEQAP YATVVVKKGD FLERIARANH TTVAKLMQIN

151 DLTTTQLKIG QVIKVPTSQD VSNEKTPQTQ TANPENYYIV QEGDSPWTIA

201 LRNHIRLDDL LKMNDLDEYK ARRLKPGDQL RIR*



A predicted signal peptide is highlighted.

The cp7222 nucleotide sequence <SEQ ID 168> is:



1 ATGAATCGTA GAGACATGGT AATAACAGCT GTCGTAGTGA ATGCTATATT 

51 GCTTGTGGCT CTTTTCGTCA CATCAAAGCG TATTGGCGTC AAGGACTATG 

101 ACGAGGGATT CCGTAATTTT GCTTCTAGCA AGGTTACACA AGCAGTAGTT

151 TCAGAAGAAA AAGTCATAGA AAAGCCTGTA GTCGCAGAAG TGCCTAGCCG 

201 TCCTATCGCT AAAGAGACTC TAGCTGCACA GTTTATTGAA AGTAAGCCGG 

251 TTATTGTAAC CACACCACCC GTGCCTGTTG TTAGCGAAAC CCCAGAAGTG 

301 CCTACTGTGG CAGTTCCGCC TCAGCCTGTT CGTGAGACAG TAAAAGAGGA 

351 ACAAGCTCCT TATGCTACTG TTGTAGTGAA AAAAGGAGAT TTTCTCGAAC

401 GCATTGCGAG AGCAAATCAT ACTACCGTTG CAAAATTGAT GCAGATCAAT 

451 GATCTTACCA CCACCCAACT TAAAATTGGT CAGGTCATCA AAGTCCCTAC

501 GTCTCAAGAT GTCAGCAACG AAAAAACTCC TCAAACACAG ACCGCAAACC 

551 CTGAAAATTA TTATATCGTC CAAGAAGGGG ATAGCCCGTG GACAATAGCA 

601 TTGCGTAACC ATATTCGATT GGATGATTTG CTAAAAATGA ATGATCTCGA 

651 TGAATATAAA GCCCGGCGCC TTAAGCCTGG AGATCAGTTG CGCATACGTT 

701 GA 



The PSORT algorithm predicts periplasmic (0.935).

The protein was expressed in E.coli and purified as a GST-fusion product (Figure 84A) and also in
his-tagged form. The recombinant proteins were used to immunise mice, whose sera were used in a 
Western blot (Figure 84B) and for FACS analysis. 

These experiments show that cp7222 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 
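
Each example reports its PSORT result in the same sentence form, which makes the predictions easy to tabulate for comparison with the experimental findings. The following is a minimal Python sketch, not part of the original disclosure, that extracts the predicted location and score from such a sentence. The names `PsortPrediction` and `parse_psort` are hypothetical, and the pattern assumes the exact wording used in these examples.

```python
import re
from typing import NamedTuple, Optional


class PsortPrediction(NamedTuple):
    location: str
    score: float


# Assumption: sentences follow "The PSORT algorithm predicts <location> (<score>)."
_PATTERN = re.compile(r"PSORT algorithm predicts\s+([A-Za-z ]+?)\s*\((\d*\.?\d+)\)")


def parse_psort(sentence: str) -> Optional[PsortPrediction]:
    """Return the predicted location and score, or None if the sentence does not match."""
    match = _PATTERN.search(sentence)
    if match is None:
        return None
    return PsortPrediction(match.group(1), float(match.group(2)))


if __name__ == "__main__":
    print(parse_psort("The PSORT algorithm predicts periplasmic (0.935)."))
```

A table built this way only records what the algorithm predicted; the surface exposure reported in each example comes from the Western blot and FACS experiments, not from the prediction.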

Example 85 

The following C.pneumoniae protein (pid 4377225) was expressed <SEQ ID 169; cp7225>: 



1 MKGTPQYHFI GIGGIGMSAL AHILLDRGYE VSGSDLYESY TIESLKAKGA

51 RCFSGHDSSH VPHDAVVVYS SSIAPDNVEY LTAIQRSSRL LHRAELLSQL

101 MEGYESILVS GSHGKTGTSS LIRAIFQEAQ KDPSYAIGGL AANCLNGYSG

151 SSKIFVAEAD ESDGSLKHYT PRAVVITNID NEHLNNYAGN LDNLVQVIQD

201 FSRKVTDLNK VFYNGDCPIL KGNVQGISYG YSPECQLHIV SYNQKAWQSH

251 FSFTFLGQEY QDIELNLPGQ HNAANAAAAC GVALTFGIDI NIIRKALKKF

301 SGVHRRLERK NISESFLFLE DYAHHPVEVA HTLRSVRDAV GLRRVIAIFQ

351 PHRFSRLEEC LQTFPKAFQE ADEVILTDVY SAGESPRESI ILSDLAEQIR

401 KSSYVHCCYV PHGDIVDYLR NYIRIHDVCV SLGAGNIYTI GEALKDFNPK

451 KLSIGLVCGG KSCEHDISLL SAQHVSKYIS PEFYDVSYFI INRQGLWRTG

501 KDFPHLIEET QGDSPLSSEI ASALAKVDCL FPVLHGPFGE DGTIQGFFEI

551 LGKPYAGPSL SLAATAMDKL LTKRIASAVG VPVVPYQPLN LCFWKRNPEL

601 CIQNLIETFS FPMIVKTAHL GSSIGIFLVR DKEELQEKIS EAFLYDTDVF

651 VEESRLGSRE IEVSCIGHSS SWYCMAGPNE RCGASGFIDY QEKYGFDGID

701 CAKISFDLQL SQESLDCVRE LAERVYRAMQ GKGSARIDFF LDEEGNYWLS

751 EVNPIPGMTA ASPFLQAFVH AGWTQEQIVD HFIIDALHKF DKQQTIEQAF

801 TKEQDLVKR*



The cp7225 nucleotide sequence <SEQ ID 170> is:

1 ATGAAGGGAA CTCCTCAGTA TCATTTTATC GGTATCGGTG GTATAGGAAT

51 GAGCGCTTTA GCTCATATTT TGCTTGATCG TGGCTATGAG GTCTCTGGAA

101 GCGACTTATA TGAAAGCTAT ACGATCGAAA GCCTGAAAGC TAAAGGTGCG

151 AGGTGTTTCT CAGGCCATGA TTCCTCCCAT GTTCCTCATG ATGCCGTCGT

201 TGTTTATAGC TCAAGTATAG CCCCTGATAA TGTAGAGTAT CTTACCGCTA

251 TTCAAAGATC ATCACGTCTT CTTCATAGAG CAGAGCTCTT GAGTCAGCTT

301 ATGGAGGGTT ATGAAAGCAT TCTGGTTTCA GGAAGCCATG GGAAGACAGG

351 GACCTCATCT CTAATTCGAG CGATTTTCCA GGAAGCTCAG AAAGATCCCT

401 CCTATGCTAT TGGAGGACTC GCTGCAAACT GCCTGAATGG GTATTCTGGA

451 TCATCGAAAA TCTTCGTTGC CGAAGCCGAT GAAAGTGATG GGTCTTTAAA

501 GCACTACACT CCCCGTGCAG TAGTCATTAC AAATATAGAT AATGAACATT

551 TGAATAATTA CGCTGGGAAT CTTGATAACC TGGTTCAGGT AATCCAGGAC

601 TTCTCTAGAA AAGTAACAGA TCTCAATAAG GTATTCTATA ACGGGGATTG

651 TCCTATTTTG AAAGGAAATG TCCAAGGGAT TTCTTATGGA TATTCACCAG

701 AATGTCAATT GCATATCGTT TCCTATAATC AAAAGGCATG GCAATCTCAC

751 TTTTCCTTTA CCTTTTTAGG CCAGGAGTAT CAAGACATTG AGCTCAATCT

801 CCCTGGACAA CATAACGCTG CAAATGCAGC AGCAGCCTGT GGAGTTGCTC

851 TTACCTTTGG CATAGACATA AACATCATTC GAAAAGCTCT CAAAAAATTC

901 TCGGGAGTTC ATCGACGTCT AGAAAGAAAA AATATATCCG AAAGCTTTCT

951 TTTCTTAGAA GATTATGCTC ATCATCCTGT AGAGGTTGCA CATACCCTGC

1001 GCTCTGTGCG TGATGCTGTG GGTTTGCGAA GAGTCATCGC AATTTTTCAA

1051 CCACATCGAT TCTCTCGTTT AGAAGAGTGC TTACAAACCT TCCCCAAAGC

1101 TTTCCAAGAA GCTGATGAAG TCATACTTAC AGATGTCTAT AGTGCCGGAG

1151 AAAGTCCTAG AGAGTCTATC ATTCTTTCCG ACCTTGCGGA ACAGATTCGT

1201 AAGTCTTCTT ATGTCCATTG TTGTTATGTT CCCCATGGAG ACATCGTAGA

1251 TTATCTACGA AACTACATTC GCATTCATGA TGTCTGTGTT TCTCTAGGAG

1301 CTGGAAATAT CTATACTATT GGAGAGGCTT TAAAAGACTT TAACCCTAAA

1351 AAATTATCCA TAGGACTCGT CTGTGGAGGG AAATCTTGCG AACACGATAT

1401 TTCTCTACTT TCTGCTCAAC ATGTCTCTAA ATATATTTCT CCTGAATTCT

1451 ATGATGTGAG TTACTTCATC ATAAATCGTC AGGGCTTATG GAGAACAGGA

1501 AAGGATTTTC CTCATCTTAT TGAAGAGACT CAAGGGGATT CGCCACTTTC

1551 TTCTGAAATC GCTTCAGCTT TAGCAAAAGT CGACTGTTTG TTTCCCGTGC

1601 TCCATGGCCC ATTTGGAGAG GATGGTACGA TCCAGGGATT TTTTGAAATC

1651 TTAGGAAAAC CTTATGCCGG ACCCTCACTA TCTTTAGCAG CAACTGCAAT

1701 GGATAAGCTG TTAACAAAAC GAATTGCATC AGCAGTGGGT GTTCCTGTAG

1751 TCCCTTACCA ACCTTTAAAT CTCTGTTTCT GGAAACGCAA TCCAGAACTA

1801 TGTATTCAGA ATCTTATAGA GACATTTTCT TTCCCTATGA TTGTAAAAAC

1851 TGCACATTTG GGATCTAGTA TTGGGATATT TTTAGTCCGT GATAAAGAGG

1901 AATTACAAGA AAAGATCTCA GAAGCATTTC TATATGACAC GGATGTGTTT

1951 GTGGAGGAAA GTCGCTTAGG GTCTCGTGAA ATCGAAGTGT CCTGTATCGG

2001 CCATTCTTCT AGCTGGTATT GTATGGCAGG GCCTAATGAA CGCTGTGGTG

2051 CTAGTGGGTT TATTGATTAT CAAGAGAAAT ATGGATTTGA TGGCATAGAT

2101 TGCGCAAAGA TCTCTTTTGA TTTACAGCTC TCACAAGAAT CTTTAGATTG

2151 TGTTAGAGAA CTTGCAGAGC GTGTCTACCG AGCAATGCAA GGAAAAGGTT

2201 CAGCTCGAAT AGATTTTTTC TTGGATGAAG AGGGGAATTA TTGGTTGTCA

2251 GAGGTCAATC CTATTCCAGG AATGACAGCA GCTAGCCCAT TTTTACAAGC

2301 TTTTGTTCAC GCAGGATGGA CGCAAGAACA AATTGTAGAT CACTTTATTA

2351 TAGATGCTCT ACATAAGTTT GATAAGCAGC AGACTATCGA ACAGGCATTC

2401 ACTAAAGAAC AAGATTTAGT TAAAAGATAA



The PSORT algorithm predicts inner membrane (0.16). 

The protein was expressed in E.coli and purified as a his-tag product (Figure 85A). The recombinant
protein was used to immunise mice, whose sera were used in a Western blot (Figure 85B) and for 
FACS analysis. 

These experiments show that cp7225 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 

Example 86 

The following C.pneumoniae protein (pid 4377248) was expressed <SEQ ID 171; cp7248>:

1 MKFWLQGCAF VGCLLLTLPC CAARRRASGE NLQQTRPIAA ANLQWESYAE

51 ALEHSKQDHK PICLFFTGSD WCMWCIKMQD QILQSSEFKH FAGVHLHMVE 

101 VDFPQKNHQP EEQRQKNQEL KAQYKVTGFP ELVFIDAEGK QIARMGFEPG 

151 GGAAYVSKVK SALKLR* 

A predicted signal peptide is highlighted. 

The cp7248 nucleotide sequence <SEQ ID 172> is: 

1 ATGAAATTTT GGTTGCAAGG ATGTGCTTTT GTCGGTTGTC TGCTATTGAC

51 TTTACCTTGT TGTGCTGCAC GAAGACGTGC TTCTGGAGAA AATTTGCAAC

101 AAACTCGTCC TATAGCAGCT GCAAATCTAC AATGGGAGAG CTATGCAGAA

151 GCTCTTGAAC ATTCTAAACA AGATCACAAA CCTATTTGTC TTTTCTTTAC

201 AGGATCAGAC TGGTGTATGT GGTGCATAAA AATGCAAGAC CAGATTTTGC

251 AAAGCTCTGA GTTTAAGCAT TTTGCGGGTG TGCATCTGCA TATGGTTGAA

301 GTTGATTTCC CCCAAAAGAA TCATCAACCT GAAGAGCAGC GCCAAAAAAA

351 TCAAGAACTG AAAGCTCAAT ATAAAGTTAC AGGATTCCCC GAACTGGTCT

401 TCATAGATGC AGAAGGAAAA CAGCTTGCTC GCATGGGATT TGAGCCTGGT

451 GGTGGAGCTG CTTACGTAAG CAAGGTGAAG TCTGCTCTTA AACTACGTTA

501 A



The PSORT algorithm predicts periplasmic (0.932). 

The protein was expressed in E.coli and purified as a GST-fusion product (Figure 86A) and also in
his-tagged form. The recombinant proteins were used to immunise mice, whose sera were used in a 
Western blot (Figure 86B) and for FACS analysis. 

The cp7248 protein was also identified in the 2D-PAGE experiment. 

These experiments show that cp7248 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone.

Example 87 

The following C.pneumoniae protein (pid 4377249) was expressed <SEQ ID 173; cp7249>:



1 MIPSPTPINF RDDTILETDP KPSLIMFSSK KTEIASERRK AHPTLFKVLG

51 TIWNIVKFII SIILFLPIAL LWVLKKTCQF FILPSSIISQ SMSKTAVAIR

101 RMTFLSHIKQ LLSLKEISAA DRVVIQYDDL VVDSLAIKIP HALPHRWILY

151 SQGNSGLMEN LFDRGDSSLH QLAKATGSNL LVFNYPGIMS SKGEAKRENL

201 VKSYQACVRY LRDEETGPKA NQIIAFGYSL GTSVQAAALD REVTDGSDGT

251 SWIVVKDRGP RSLADVANQI CKPIASAIIK LVGWNIDSVK PSERLRCPEI

301 FIYNSNHDQE LISDGLFERE NCVATPFLEL PEVKTSGTKI PIPERDLLHL

351 NPLSPNVVDR LAAVISNYLD SENRKSQQPD

The cp7249 nucleotide sequence <SEQ ID 174> is:

1 ATGATCCCAT CCCCTACCCC AATAAACTTT CGTGATGATA CGATTCTAGA

51 GACGGATCCA AAGCCGTCTT TAATCATGTT CTCTTCAAAA AAAACAGAGA

101 TAGCTTCTGA AAGACGGAAG GCCCATCCCA CCTTATTTAA AGTTCTAGGA

151 ACGATTTGGA ATATTGTGAA GTTTATTATC TCAATCATTC TGTTCCTTCC

201 CTTAGCGTTA TTGTGGGTAC TCAAGAAAAC CTGTCAGTTT TTCATTCTCC

251 CATCTTCTAT CATATCTCAG AGCATGTCAA AAACAGCTGT GGCAATTCGG

301 CGAATGACCT TTCTGTCCCA TATTAAACAA CTCCTAAGCC TTAAGGAAAT

351 CTCAGCTGCC GATCGTGTGG TTATACAATA TGACGATTTG GTGGTTGATA

401 GCTTAGCTAT AAAGATACCT CATGCTCTTC CCCACAGGTG GATTCTTTAT

451 TCTCAAGGAA ACTCTGGATT GATGGAAAAC CTGTTCGATC GGGGCGATTC

501 CTCTCTACAC CAGCTAGCCA AAGCAACCGG CTCGAATCTT CTTGTGTTCA

551 ACTATCCTGG AATTATGTCC AGCAAAGGAG AAGCGAAACG AGAAAATCTG

601 GTTAAATCGT ATCAGGCATG CGTACGCTAC CTACGAGATG AAGAGACAGG

651 TCCTAAAGCC AATCAAATCA TAGCTTTCGG ATACTCTTTG GGAACTAGTG

701 TCCAAGCTGC TGCTCTAGAT CGTGAGGTCA CTGATGGCAG TGATGGAACT

751 TCATGGATTG TTGTAAAAGA TCGGGGCCCT CGCTCTCTAG CAGATGTCGC

801 GAATCAAATT TGTAAGCCCA TAGCTTCCGC GATTATAAAA CTCGTTGGTT

851 GGAACATAGA CTCTGTGAAA CCTAGCGAAA GATTGCGTTG TCCCGAAATT

901 TTCATTTACA ACTCTAATCA TGATCAAGAA CTCATTAGCG ACGGCCTCTT

951 CGAAAGAGAA AATTGCGTAG CAACACCTTT TCTAGAGCTT CCTGAAGTAA

1001 AAACCTCGGG GACTAAAATT CCTATACCCG AAAGGGATCT TCTCCATCTA

1051 AATCCTCTCA GTCCAAATGT AGTAGACAGA TTAGCAGCAG TGATCTCTAA

1101 TTATTTAGAT TCTGAAAACA GAAAGTCTCA GCAACCTGAT TAA



The PSORT algorithm predicts inner membrane (0.571).

The protein was expressed in E.coli and purified as a GST-fusion product (Figure 87A). The
recombinant protein was used to immunise mice, whose sera were used in a Western blot (Figure
87B) and for FACS analysis.

These experiments show that cp7249 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 

Example 88 

The following C. pneumoniae protein (pid 4377261) was expressed <SEQ ID 175; cp7261>: 

1 MLPISILLFY VILGCLSAYI ADKKKRNVIG WFFAGAFFGF IGLVVLLLLP

51 SRRNALEKPQ NDPFDNSDLF DDLKKSLAGN DEIPSSGDLQ EIVIDTEKWF

101 YLNKDRENVG PISFEELVVL LKGKTYPEEI WVWKKGMKDW QRVKDVPSLQ

151 QALKEASK* 

The cp7261 nucleotide sequence <SEQ ID 176> is: 

1 ATGCTCCCTA TTTCGATTTT ATTATTTTAT GTGATTCTAG GTTGTCTATC 

51 TGCCTACATA GCAGATAAGA AAAAACGAAA TGTTATTGGC TGGTTTTTTG 

101 CAGGAGCATT TTTTGGATTT ATTGGTCTAG TTGTCCTTCT TCTTCTTCCT 

151 TCTCGTCGAA ACGCTTTAGA AAAGCCACAA AACGATCCTT TTGATAACTC 

201 CGATCTTTTT GATGATTTGA AAAAAAGTTT AGCAGGTAAT GACGAGATAC 

251 CCTCATCGGG AGATCTTCAA GAAATCGTTA TCGATACAGA GAAGTGGTTT 

301 TATTTAAATA AAGATAGAGA AAACGTAGGT CCGATATCTT TTGAGGAGTT 

351 GGTCGTACTT TTAAAGGGAA AAACGTATCC AGAAGAAATT TGGGTATGGA

401 AAAAGGGAAT GAAAGATTGG CAACGAGTGA AGGATGTTCC ATCACTACAA 

451 CAGGCTTTGA AAGAAGCATC AAAATAA 
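
Because each example gives both a protein sequence and its coding sequence, one listing can be checked against the other by translating the nucleotides. The following is a minimal Python sketch, not part of the original disclosure, assuming the standard genetic code and a reading frame that starts at the first base supplied. The names `CODON_TABLE` and `translate` are hypothetical.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a coding sequence, ignoring whitespace; '*' marks a stop codon."""
    dna = "".join(dna.split()).upper()
    if len(dna) % 3:
        raise ValueError("sequence length is not a multiple of three")
    # A codon containing anything other than A, C, G or T raises KeyError.
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


if __name__ == "__main__":
    # Final row of the cp7261 listing (positions 451 to 477, SEQ ID 176).
    print(translate("CAGGCTTTGA AAGAAGCATC AAAATAA"))  # prints QALKEASK*
```

The printed residues agree with the last row of the cp7261 protein listing (SEQ ID 175), which illustrates how a disagreement between the two listings would show up as a mismatched residue.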

The PSORT algorithm predicts inner membrane (0.848). 

The protein was expressed in E.coli and purified as a GST-fusion product (Figure 88A). The
recombinant protein was used to immunise mice, whose sera were used in a Western blot (Figure 
88B) and for FACS analysis. 

These experiments show that cp7261 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 

Example 89 

The following C.pneumoniae protein (pid 4377305) was expressed <SEQ ID 177; cp7305>: 

1 MEVYSFHPAV RTSFQHRVMA ALDAWFFLGG HKLKVVSLDS CNSGWAYQEL 

51 VSISTTEKVL KLLSYLLVPI VIIALLIRCL LHSNFRIDVE KERWLKIREL 

101 GIDIESCKLP SSYVNQVSSF IWFEKDKSKR PRIDVDYHTL HSKDWVVFPI 

151 VFQKIPKTSR FSYWFSQKET RKRDYVRNML DHVIGYLTSE GGEWLQYISK 

201 TSYQSATSLD PERVLQYCLT DNQELQGEVQ RLLNEESATK SSGDKEVLLS 

251 HVSDIICQCW WPKFLEVIQS PAFIEELVEE VSGKLNLDFL CLEKANTLDQ 

301 ELRNSLLRAV VHHGSEGVDI KKVGAGLIIY TEAIQLQIPF SRS* 

The cp7305 nucleotide sequence <SEQ ID 178> is: 

1 ATGGAAGTTT ATAGTTTTCA CCCTGCGGTA AGGACTTCGT TTCAGCACCG 

51 TGTAATGGCA GCACTAGATG CTTGGTTTTT TCTAGGAGGG CACCGTTTAA 

101 AAGTAGTTTC TCTAGATAGT TGTAACTCAG GTTGGGCGTA TCAAGAACTT 

151 GTGTCTATTT CAACGACAGA AAAAGTCTTG AAACTACTCT CTTACCTACT 

201 CGTACCGATT GTCATAATAG CTCTGTTAAT TCGTTGTCTT TTACATAGCA 

251 ATTTTAGGAT AGACGTAGAG AAGGAACGTT GGTTAAAAAT AAGGGAGTTA 

301 GGAATTGATA TAGAAAGCTG CAAACTCCCC AGTTCTTATG TAAACCAGGT 

351 TTCCTCGTTT ATTTGGTTTG AAAAAGATAA ATCCAAACGG CCACGTATTG 

401 ATGTAGATTA TCATACGCTA CATAGCAAAG ACTGGGTAGT TTTCCCTATC 

451 GTTTTTCAGA AAATTCCAAA GACCTCGCGT TTCAGTTATT GGTTCTCACA 

501 AAAAGAAACA AGGAAGAGGG ATTATGTGAG AAATATGCTG GACCACGTCA 

551 TTGGTTATCT AACGTCAGAA GGTGGGGAGT GGTTGCAGTA TATATCGAAA 

601 ACCTCTTATC AAAGCGCTAC TTCCTTGGAT CCTGAAAGAG TTCTTCAATA 

651 TTGCTTAACT GATAACCAGG AGCTCCAGGG AGAAGTGCAA CGTTTGCTTA 

701 ATGAGGAGAG TGCGACCAAA AGCTCTGGGG ATAAGGAAGT TTTGTTAAGT 

751 CATGTATCTG ACATTATTTG CCAGTGTTGG TGGCCAAAGT TTCTTGAAGT 

801 TATACAATCT CCGGCCTTTA TTGAAGAATT AGTAGAAGAA GTGAGTGGTA 

851 AACTTAATTT AGATTTTTTA TGCCTAGAAA AGGCTAATAC ATTAGATCAG 

901 GAGTTGAGAA ACAGTCTTCT AAGAGCAGTC GTACACCACG GTTCTGAAGG 

951 AGTTGATATT AAGAAAGTTG GTGCCGGCCT CATTATTTAT ACGGAAGCTA 

1001 TTCAATTACA GATTCCCTTC TCAAGGAGTT AA 

The PSORT algorithm predicts inner membrane (0.508). 

The protein was expressed in E.coli and purified as a GST-fusion product (Figure 89A) and also as a 
double GST/his fusion. The recombinant proteins were used to immunise mice, whose sera were 
used in a Western blot (Figure 89B) and for FACS analysis. 

These experiments show that cp7305 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 

Example 90 

The following C.pneumoniae protein (pid 4377347) was expressed <SEQ ID 179; cp7347>: 

1 MKKGKLGAIV FGLLFTSSVA GFSKDLTKDN AYQDLNVIEH LISLKYAPLP 

51 WKELLFGWDL SQQTQQARLQ LVLEEKPTTN YCQKVLSNYV RSLNDYHAGI 

101 TFYRTESAYI PYVLKLSEDG HVFVVDVQTS QGDIYLGDEI LEVDGMGIRE 

151 AIESLRFGRG SATDYSAAVR SLTSRSAAFG DAVPSGIAML KLRRPSGLIR 

201 STPVRWRYTP EHIGDFSLVA PLIPEHKPQL PTQSCVLFRS GVNSQSSSSS 

251 LFSSYMVPYF WEELRVQNKQ RFDSNHHIGS RNGFLPTFGP ILWEQDKGPY 

301 RSYIFKAKDS QGNPHRIGFL RISSYVWTDL EGLEEDHKDS PWELFGEIID 

351 HLEKETDALI IDQTHNPGGS VFYLYSLLSM LTDHPLDTPK HRMIFTQDEV 

401 SSALHWQDLL EDVFTDEQAV AVLGETMEGY CMDMHAVASL QNFSQSVLSS 

451 WVSGDINLSK PMPLLGFAQV RPHPKHQYTK PLFMLIDEDD FSCGDLAPAI 

501 LKDNGRATLI GKPTAGAGGF VFQVTFPNRS GIKGLSLTGS LAVRKDGEFI 

551 ENLGVAPHID LGFTSRDLQT SRFTDYVEAV KTIVLTSLSE NAKKSEEQTS 

601 PQETPEVIRV SYPTTTSAS* 

A predicted signal peptide is highlighted. 

The cp7347 nucleotide sequence <SEQ ID 180> is: 

1 ATGAAAAAAG GGAAATTAGG AGCCATAGTT TTTGGCCTTC TATTTACAAG 

51 TAGTGTTGCT GGTTTTTCTA AGGATTTGAC TAAAGACAAC GCTTATCAAG 

101 ATTTAAATGT CATAGAGCAT TTAATATCGT TAAAATATGC TCCTTTACCA 

151 TGGAAGGAAC TATTATTTGG TTGGGATTTA TCTCAGCAAA CACAGCAAGC 

201 TCGCTTGCAA CTGGTCTTAG AAGAAAAACC AACAACCAAC TACTGCCAGA 

251 AGGTACTCTC TAACTACGTG AGATCATTAA ACGATTATCA TGCAGGGATT 

301 ACGTTTTATC GTACTGAAAG TGCGTATATC CCTTACGTAT TGAAGTTAAG 

351 TGAAGATGGT CATGTCTTTG TAGTCGACGT ACAGACTAGC CAAGGGGATA 

401 TTTACTTAGG GGATGAAATC CTTGAAGTAG ATGGAATGGG GATTCGTGAG 

451 GCTATCGAAA GCCTTCGCTT TGGACGAGGG AGTGCCACAG ACTATTCTGC 

501 TGCAGTTCGT TCCTTGACAT CGCGTTCCGC CGCTTTTGGA GATGCGGTTC 

551 CTTCAGGAAT TGCCATGTTG AAACTTCGCC GACCCAGTGG TTTGATCCGT 

601 TCGACACCGG TCCGTTGGCG TTATACTCCA GAGCATATCG GAGATTTTTC 

651 TTTAGTTGCT CCTTTGATTC CTGAACATAA ACCTCAATTA CCTACACAAA 

701 GTTGTGTGCT ATTCCGTTCC GGGGTAAATT CACAGTCTTC TAGTAGCTCT 

751 TTATTCAGTT CCTACATGGT GCCTTATTTC TGGGAAGAAT TGCGGGTTCA 

801 AAATAAGCAG CGTTTTGACA GTAATCACCA TATAGGGAGC CGTAATGGAT 

851 TTTTACCTAC GTTTGGTCCT ATTCTTTGGG AACAAGACAA GGGGCCCTAT 

901 CGTTCCTATA TCTTTAAAGC AAAAGATTCT CAGGGCAATC CCCATCGCAT 

951 AGGATTTTTA AGAATTTCTT CTTATGTTTG GACTGATTTA GAAGGACTTG 

1001 AAGAGGATCA TAAGGATAGT CCTTGGGAGC TCTTTGGAGA GATCATCGAT 

1051 CATTTGGAAA AAGAGACTGA TGCTTTGATT ATTGATCAGA CCCATAATCC 

1101 TGGAGGCAGT GTTTTCTATC TCTATTCGTT ACTATCTATG TTAACAGATC 

1151 ATCCTTTAGA TACTCCTAAA CATAGAATGA TTTTCACTCA GGATGAAGTC 

1201 AGCTCGGCTT TGCACTGGCA AGATCTACTA GAAGATGTCT TCACAGATGA 

1251 GCAGGCAGTT GCCGTGCTAG GGGAAACTAT GGAAGGATAT TGCATGGATA 

1301 TGCATGCTGT AGCCTCTCTT CAAAACTTCT CTCAGAGTGT CCTTTCTTCC 

1351 TGGGTTTCAG GTGATATTAA CCTTTCAAAA CCTATGCCTT TGCTAGGATT 

1401 TGCACAGGTT CGACCTCATC CTAAACATCA ATATACTAAA CCTTTGTTTA 

1451 TGTTGATAGA CGAGGATGAC TTCTCTTGTG GAGATTTAGC GCCTGCAATT 

1501 TTGAAGGATA ATGGCCGCGC TACTCTCATT GGAAAGCCAA CAGCAGGAGC 

1551 TGGAGGTTTT GTATTCCAAG TCACTTTCCC TAACCGTTCT GGAATTAAAG 

1601 GTCTTTCTTT AACAGGATCT TTAGCTGTTA GGAAAGATGG TGAGTTTATT 

1651 GAAAACTTAG GAGTGGCTCC TCATATTGAT TTAGGATTTA CCTCCAGGGA 

1701 TTTGCAAACT TCCAGGTTTA CTGATTACGT TGAGGCAGTG AAAACTATAG 

1751 TTTTAACTTC TTTGTCTGAG AACGCTAAGA AGAGTGAAGA GCAGACTTCT 

1801 CCGCAAGAGA CGCCTGAAGT TATTCGAGTC TCTTATCCCA CAACGACTTC 

1851 TGCTTCGTAA 

The PSORT algorithm predicts periplasmic space (0.2497). 

The protein was expressed in E.coli and purified as a GST-fusion product (Figure 90A) and also in 
his-tagged form. The recombinant proteins were used to immunise mice, whose sera were used in a 
Western blot (Figure 90B) and for FACS analysis. 

These experiments show that cp7347 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 

Example 91 

The following C.pneumoniae protein (pid 4377353) was expressed <SEQ ID 181; cp7353>: 

1 MNMPVPSAVP SANITLKEDS STVSTASGIL KTATGEVLVS CTALEGSSST 

51 DALISLALGQ IILATQQELL LQSTNVHQLL FLPPEVVELE IQVVDLLVQL 

101 EHAETITSEP QETQTQSRSE QTLPQQSSSK QSALSPRSLK PEISDSKQQQ 

151 ALQTPKDSAV RKHSEAPSPE TQARASLSQA SSSSQRSLPP QESAPERTLL 

201 EQQKASSFSP LSQFSAEKQK EALTTSKSHE LYKERDQDRQ QREQHDRKHD 

251 QEEDAESKKK KKKRGLGVKA VAEEPGENLD IAALIFSDQM RPPAEETSKK 

301 ETTFKKKLPS PMSVFSRFIP SKNPLSVGSS IHGPIQTPKV ENVFLRFMKL 

351 MARILGQAEA EANELYMRVK QRTDDVDTLT VLISKINNEK KDIDWSENEE 

401 MKALLNRAKE IGVTIDKEKY TWTEEEKRLL KENVQMRKEN MEKITQMERT 

451 DMQRHLQEIS QCHQARSNVL KLLKELMDTF IYNLRP* 

The cp7353 nucleotide sequence <SEQ ID 182> is: 

1 ATGAATATGC CTGTTCCTTC TGCAGTTCCC TCTGCAAATA TAACTCTAAA 

51 AGAAGACAGC TCAACAGTTT CCACAGCCTC TGGAATATTA AAGACTGCAA 

101 CAGGTGAAGT CTTAGTCTCT TGTACAGCGC TAGAAGGAAG CTCTTCTACA 

151 GATGCTTTAA TTAGCTTAGC TTTAGGACAA ATCATTCTTG CGACCCAACA 

201 AGAACTGCTC TTACAAAGCA CAAATGTTCA TCAACTCCTC TTCCTCCCTC 

251 CTGAAGTTGT AGAATTAGAA ATCCAAGTTG TTGACTTGCT AGTGCAATTG 

301 GAACATGCAG AGACAATCAC AAGTGAACCA CAAGAAACAC AAACGCAAAG 

351 TAGGAGTGAG CAGACCCTCC CTCAACAAAG CAGCAGTAAA CAATCTGCTC 

401 TCTCCCCACG CTCCTTAAAA CCTGAAATTT CTGATTCTAA ACAACAGCAA 

451 GCTCTTCAAA CACCAAAAGA CTCTGCTGTA AGAAAACACA GCGAAGCACC 

501 GTCACCTGAG ACACAAGCTC GCGCTTCCTT ATCTCAGGCA AGCTCAAGTT 

551 CTCAGAGATC CTTACCTCCG CAAGAAAGTG CGCCAGAAAG AACACTATTA 

601 GAACAACAAA AAGCAAGCTC CTTCTCTCCT CTATCCCAGT TCTCTGCAGA 

651 GAAACAAAAA GAGGCCCTGA CGACCTCAAA ATCTCATGAA CTCTATAAAG 

701 AACGCGATCA AGATCGCCAA CAAAGAGAGC AGCACGACAG AAAGCACGAT 

751 CAGGAAGAAG ACGCTGAATC TAAAAAGAAA AAGAAGAAAC GTGGTCTCGG 

801 TGTAGAGGCA GTCGCTGAGG AACCCGGAGA AAATCTAGAT ATTGCCGCTT 

851 TAATCTTCTC AGATCAAATG CGACCTCCTG CTGAAGAAAC TTCTAAAAAA 

901 GAAACGACAT TCAAAAAGAA GCTACCTTCT CCAATGTCTG TGTTTAGCAG 

951 ATTCATCCCT AGTAAGAATC CGTTATCTGT AGGCTCTTCA ATACACGGGC 

1001 CTATACAAAC TCCAAAAGTA GAAAATGTGT TCTTAAGGTT CATGAAGCTC 

1051 ATGGCAAGAA TCTTAGGCCA AGCCGAAGCC GAAGCTAATG AACTCTACAT 

1101 GCGAGTCAAA CAACGTACCG ATGATGTAGA CACACTCACA GTCCTTATCT 

1151 CTAAGATCAA TAATGAAAAG AAAGACATTG ATTGGAGTGA AAATGAAGAG 

1201 ATGAAAGCTC TTTTAAATCG AGCTAAAGAG ATTGGAGTCA CTATAGACAA 

1251 AGAAAAATAT ACTTGGACAG AAGAGGAAAA AAGACTTCTA AAAGAGAATG 

1301 TCCAAATGCG CAAAGAGAAT ATGGAGAAAA TCACTCAAAT GGAAAGGACG 

1351 GACATGCAAA GGCACCTCCA AGAGATTTCT CAATGTCATC AAGCGCGCTC 

1401 TAATGTATTG AAGTTATTGA AAGAACTTAT GGACACCTTC ATTTACAACC 

1451 TACGCCCCTA A 



The PSORT algorithm predicts cytoplasm (0.1308). 

The protein was expressed in E.coli and purified as a GST-fusion product (Figure 91A). The 
recombinant protein was used to immunise mice, whose sera were used in a Western blot (Figure 
91B) and for FACS analysis. 

These experiments show that cp7353 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 
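
The PSORT statements in these examples follow a fixed sentence pattern, so the predicted localisation and its score can be extracted mechanically and set beside the experimental finding. The following is a minimal Python sketch and not part of the original disclosure; the function name `parse_psort` and the regular expression are illustrative assumptions, and the three sentences are copied from Examples 88, 90 and 91.

```python
import re

# Pattern for sentences such as
# "The PSORT algorithm predicts inner membrane (0.848)."
# (hypothetical helper; the wording is taken from the examples in this text)
PSORT_RE = re.compile(
    r"The PSORT algorithm predicts (?P<loc>[a-z ]+?) \((?P<score>\d*\.\d+)\)\."
)

def parse_psort(sentence):
    """Return (localisation, score) from a PSORT sentence, or None if no match."""
    match = PSORT_RE.search(sentence)
    if match is None:
        return None
    return match.group("loc"), float(match.group("score"))

sentences = {
    "cp7261": "The PSORT algorithm predicts inner membrane (0.848).",
    "cp7347": "The PSORT algorithm predicts periplasmic space (0.2497).",
    "cp7353": "The PSORT algorithm predicts cytoplasm (0.1308).",
}

predictions = {name: parse_psort(text) for name, text in sentences.items()}
for name, (location, score) in predictions.items():
    print(f"{name}: {location} ({score})")
```

None of the three predicted compartments is the cell surface, which is consistent with the statement that surface exposure is not evident from the sequence alone.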

Example 92 

The following C.pneumoniae protein (pid 4377408) was expressed <SEQ ID 183; cp7408>: 

1 MLKIQKKRMC VSVVITVGAI VGFFNSADAA PKKKKIPIQI LYSFTKVSSY 

51 LKNEDASTIF CVDVDRGLLQ HRYLGSPGWQ ETRRRQLFKS LENQSYGNER 

101 LGEETLAIDI FRNKECLESE IPEQMEAILA NSSALVLGIS SFGITGIPAT 

151 LHSLLRQNLS FQKRSIASES FLLKIDSAPS DASVFYKGVL FRGETAIVDA 

201 LSQLFAQLDL SPKKIIFLGE DPEVVQAVGS ACIGWGMNFL GLVYYPAQES 

251 LFSYVHPYST ATELQEAQGL QVISDEVAQL TLNALPKMN* 

The cp7408 nucleotide sequence <SEQ ID 184> is: 

1 ATGTTGAAAA TCCAGAAAAA AAGAATGTGT GTCAGCGTAG TCATCACGGT 

51 AGGCGCCATA GTGGGGTTTT TCAATTCTGC AGACGCAGCA CCAAAGAAAA 

101 AGAAGATCCC TATACAGATT CTCTACTCCT TTACTAAAGT CTCTTCCTAT 

151 TTAAAAAACG AAGACGCAAG TACTATATTT TGCGTCGATG TGGATCGTGG 

201 ACTTCTCCAG CATCGGTATT TAGGTAGTCC AGGATGGCAG GAAACCAGAC 

251 GTCGGCAGTT ATTTAAATCC TTAGAAAATC AATCATACGG CAACGAACGT 

301 TTAGGAGAAG AAACTCTTGC TATTGATATT TTCAGGAACA AAGAGTGCTT 

351 GGAGAGCGAG ATCCCAGAGC AGATGGAAGC TATCCTTGCA AATTCCTCGG 

401 CCTTGGTCTT AGGCATCTCT TCTTTTGGGA TCACAGGAAT TCCTGCGACT 

451 TTGCATAGTT TGCTTCGACA GAATCTATCT TTCCAAAAAC GCTCTATAGC 

501 ATCGGAGAGC TTCCTTTTAA AGATCGATAG TGCCCCCTCA GATGCCTCTG 

551 TTTTTTATAA AGGCGTGCTT TTCCGCGGAG AGACTGCGAT CGTGGATGCG 

601 TTAAGCCAAT TATTTGCCCA GCTCGATCTT TCTCCTAAAA AAATTATCTT 

651 TCTAGGAGAA GACCCTGAGG TCGTTCAAGC TGTTGGGTCT GCTTGTATAG 

701 GTTGGGGCAT GAACTTTTTA GGCCTGGTAT ACTATCCTGC TCAAGAAAGC 

751 CTTTTTTCTT ATGTTCATCC TTACTCTACA GCAACGGAGC TCCAAGAAGC 

801 ACAGGGTTTA CAAGTAATTT CAGATGAAGT CGCACAGCTT ACTTTAAACG 

851 CTCTTCCGAA AATGAATTAA 

The PSORT algorithm predicts inner membrane (0.123). 

The protein was expressed in E.coli and purified as a his-tag product (Figure 92A). The recombinant 
protein was used to immunise mice, whose sera were used in a Western blot (Figure 92B) and for 
FACS analysis. 

These experiments show that cp7408 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 

Example 93 

The following C.pneumoniae protein (pid 4376424) was expressed <SEQ ID 185; cp6424>: 

1 MMHNIVVLSE EPGRSAFLGR TAFFPNKYPI AQGGVGIPST IGNLFTIWYC 
51 FYFYRAATPQ SDHPDGCGFI LLERLKELGA GFFYCDLRES NTTGFTLFFE 
101 GSNKGVLKNH LFIRDE* 

The cp6424 nucleotide sequence <SEQ ID 186> is: 

1 ATGATGCACA ATATTGTTGT TCTTAGTGAG GAACCTGGAC GAAGCGCTTT 
51 TCTTGGTAGG ACGGCATTTT TCCCTAATAA GTATCCAATA GCTCAGGGTG 
101 GTGTTGGAAT ACCATCTACA ATAGGCAATC TCTTTACTAT ATGGTACTGT 
151 TTCTATTTTT ATAGAGCTGC AACTCCACAA TCTGATCATC CTGACGGATG 
201 TGGCTTTATT CTACTAGAAA GGCTTAAGGA GCTCGGTGCA GGGTTCTTTT 
251 ATTGTGATCT TCGTGAGTCC AATACCACTG GCTTTACTCT TTTTTTTGAA 
301 GGCTCCAATA AAGGTGTGTT AAAGAATCAC TTGTTTATTA GAGATGAGTA 
351 A 

The PSORT algorithm predicts cytoplasm (0.2502). 

The protein was expressed in E.coli and purified as a GST-fusion product (Figure 93A) and also in 
his-tagged form. The recombinant proteins were used to immunise mice, whose sera were used in 
Western blots (Figure 93B) and for FACS analyses (Figure 93C; GST-fusion). 

These experiments show that cp6424 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 

Example 94 

The following C.pneumoniae protein (pid 4376449) was expressed <SEQ ID 187; cp6449>: 

1 VASETYPSQI LHAQREVRDA YFNQADCHPA RANQILEAKK ICLLDVYHTN 

51 HYSVFTFCVD NYPNLRFTFV SSKNNEMNGL SNPLDNVLVE AMVRRTHARN 

101 LLAACKIRNI EVPRVVGLDL RSGILISKLE LKQPQFQSLT EDFVNHSTNQ 

151 EEARVHQKHV LLISLILLCK QAVLESFQEK KRSS* 

The cp6449 nucleotide sequence <SEQ ID 188> is: 

1 GTGGCGTCTG AAACGTATCC TTCTCAGATA TTGCACGCTC AGAGGGAAGT 
51 ACGTGATGCC TATTTTAATC AAGCGGATTG CCATCCTGCT CGGGCTAATC 
101 AGATTCTCGA GGCTAAGAAA ATCTGTTTAT TAGATGTTTA TCATACTAAT 
151 CATTATTCCG TATTTACTTT TTGTGTAGAT AATTATCCGA ATCTCCGCTT 
201 TACATTTGTA TCTTCAAAAA ACAATGAGAT GAATGGCTTA TCTAATCCTC 
251 TAGATAATGT TCTTGTAGAG GCTATGGTAC GTAGAACACA TGCAAGAAAC 
301 CTACTTGCAG CGTGTAAAAT TCGAAATATT GAGGTTCCAA GGGTTGTTGG 
351 GCTTGACCTA AGATCTGGGA TACTCATTTC GAAACTAGAA TTGAAGCAAC 
401 CTCAGTTCCA AAGTTTAACA GAAGACTTCG TAAATCATTC CACAAATCAG 
451 GAAGAAGCTC GCGTCCATCA AAAGCATGTG TTGCTAATTT CTTTAATTTT 
501 ACTTTGCAAG CAGGCCGTTC TGGAATCATT CCAGGAAAAA AAGCGATCCT 
551 CTTAA 

The PSORT algorithm predicts inner membrane (0.2084). 

The protein was expressed in E.coli and purified as a GST-fusion product (Figure 94A) and also in 
his-tagged form. The recombinant proteins were used to immunise mice, whose sera were used in 
Western blots (Figure 94B) and for FACS analyses (Figure 94C; GST-fusion). 

These experiments show that cp6449 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 

Example 95 

The following C.pneumoniae protein (pid 4376495) was expressed <SEQ ID 189; cp6495>: 

MRELNAFELTQPEEYRNRWVLMPCLKCRFCRTQHAKVWSYRCVHEASLYEK 
LRKMISPHKIRYFECGAYGTKLQRPHYHLLLS 

The cp6495 nucleotide sequence <SEQ ID 190> is: 

TTGCGAGAATTAAATGCTTTTGAATTAACTCAACCTGAAGAGTATCGAAACCGTTGGGTTTTGATGCCTTGTCTTAAGTGT 

CGTTTTTGTAGAACGCAACATGCAAAAGTCTGGTCTTATCGTTGTGTCCATGAAGCTTCTTTGTATGAGAAAAATTGTTTT 

CTTACTTTGACTTATGATGATAAGCATTTACCTCAGTATGGTTCGTTGGTAAAGCTGCATTTACAGCTGTTTCTTAAGAGA 

TTAAGAAAGATGATTTCTCCTCATAAAATTCGTTATTTTGAATGTGGTGCGTATGGAA 

CATCTACTTTTATCATGA 

The PSORT algorithm predicts cytoplasmic (0.280). 

The protein was expressed in E.coli and purified as a GST-fusion product (Figure 95A). The 
recombinant protein was used to immunise mice, whose sera were used in a Western blot (Figure 
95B) and for FACS analysis (Figure 95C). 

These experiments show that cp6495 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 

Example 96 

The following C.pneumoniae protein (pid 4376506) was expressed <SEQ ID 191; cp6506>: 

1 MRRFLFLILS SLPLVAFSAD NFTILEEKQS PLSRVSIIFA LPGVTPVSFD 

51 GNCPIPWFSH SKKTLEGQRI YYSGDSFGKY FVVSALWPNK VSSAVVACNM 

101 ILKHRVDLIL IIGSCYSRSQ DSRFGSVLVS KGYINYDADV RPFFERFEIP 

151 DIKKSVFATS EVHREAILRG GEEFISTHKQ EIEELLKTHG YLKSTTKTEH 

201 TLMEGLVATG ESFAMSRNYF LSLQKLYPEI HGFDSVSGAV SQVCYEYSIP 

251 CLGVNILLPH PLESRSNEDW KHLQSEASKI YMDTLLKSVL KELCSSH* 

The cp6506 nucleotide sequence <SEQ ID 192> is: 

1 ATGCGTCGTT TTCTGTTTCT TATTCTTAGC TCTCTTCCTT TGGTCGCATT 

51 CTCTGCTGAT AATTTCACTA TTCTAGAAGA AAAACAGAGT CCTTTAAGTC 

101 GTGTAAGTAT TATTTTTGCT TTACCTGGGG TTACTCCCGT TTCTTTTGAT 

151 GGTAATTGTC CTATTCCTTG GTTTTCTCAT AGTAAAAAGA CTCTAGAGGG 

201 ACAGAGAATT TATTACTCTG GCGACTCCTT TGGGAAATAC TTTGTAGTTT 

251 CTGCTCTTTG GCCTAATAAA GTTTCTTCAG CTGTTGTGGC TTGTAATATG 

301 ATTCTTAAAC ATCGAGTGGA TCTTATTCTA ATTATAGGCT CGTGTTACTC 

351 TAGGTCTCAA GATAGCCGTT TTGGCAGCGT CTTAGTTTCT AAAGGCTACA 

401 TTAATTATGA TGCAGATGTG AGGCCTTTCT TTGAAAGATT TGAGATTCCA 

451 GACATTAAAA AGAGTGTTTT TGCAACCAGT GAGGTTCATC GGGAGGCAAT 

501 TCTTCGTGGA GGCGAAGAGT TTATTTCTAC CCATAAACAA GAAATCGAAG 

551 AGCTTTTGAA GACTCATGGG TATTTGAAAT CAACAACCAA AACGGAGCAC 

601 ACCTTAATGG AAGGTTTGGT TGCTACAGGC GAGTCTTTCG CGATGTCGCG 

651 AAACTATTTT CTTTCCTTAC AAAAATTGTA TCCAGAGATT CATGGTTTTG 

701 ATAGTGTCAG CGGCGCTGTT TCTCAGGTAT GCTATGAATA TAGCATTCCT 

751 TGTTTAGGTG TGAATATCCT TCTCCCTCAT CCTTTAGAAT CACGGAGTAA 

801 CGAGGATTGG AAGCATCTTC AAAGTGAGGC AAGTAAAATT TATATGGATA 

851 CCTTGCTCAA GAGTGTATTA AAAGAACTCT GTTCTTCTCA TTAA 

The PSORT algorithm predicts periplasmic space (0.571). 

The protein was expressed in E.coli and purified as his-tag (Figure 96A) and GST-fusion (Figure 
96B) products. The GST-fusion protein was used to immunise mice, whose sera were used in a 
Western blot (Figure 96C) and for FACS analysis (Figure 96D). 

These experiments show that cp6506 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 

Example 97 

The following C.pneumoniae protein (pid 4376882) was expressed <SEQ ID 193; cp6882>: 

1 MSLLNLPSSQ DSASEDSTSQ SQIFDPIRNR ELVSTPEEKV RQRLLSFLMH 

51 KLNYPKKLII IEKELKTLFP LLMRKGTLIP KRRPDILIIT PPTYTDAQGN 

101 THNLGDPKPL LLIECKALAV NQNALKQLLS YNYSIGATCI AMAGKHSQVS 

151 ALFNPKTQTL DFYPGLPEYS QLLNYFISLN L* 

The cp6882 nucleotide sequence <SEQ ID 194> is: 

1 ATGTCCTTAT TGAACCTTCC CTCAAGCCAG GATTCTGCAT CTGAGGACTC 

51 CACATCGCAA TCTCAAATCT TCGATCCCAT TAGAAATCGG GAGTTAGTTT 

101 CTACTCCCGA AGAAAAAGTC CGCCAAAGGT TGCTCTCCTT CCTAATGCAT 

151 AAGCTGAACT ACCCTAAGAA ACTCATCATC ATAGAAAAAG AACTCAAAAC 

201 TCTTTTTCCT CTGCTTATGC GTAAAGGAAC CCTAATCCCA AAACGCCGCC 

251 CAGATATTCT CATCATCACT CCCCCCACAT ACACAGACGC ACAGGGAAAC 

301 ACTCACAACC TAGGCGACCC AAAACCCCTG CTACTTATCG AATGTAAGGC 

351 CTTAGCCGTA AACCAAAATG CACTCAAACA ACTCCTTAGC TATAACTACT 

401 CTATCGGAGC CACCTGCATT GCTATGGCAG GGAAACACTC TCAAGTGTCA 

451 GCTCTCTTCA ATCCAAAAAC ACAAACTCTT GATTTTTATC CTGGCCTCCC 

501 AGAGTATTCC CAACTCCTAA ACTACTTTAT TTCTTTAAAC TTATAG 

The PSORT algorithm predicts cytoplasm (0.362). 

The protein was expressed in E.coli and purified as a GST-fusion product (Figure 97A). The protein 
was used to immunise mice, whose sera were used in a Western blot (Figure 97B) and for FACS 
analysis (Figure 97C). 

These experiments show that cp6882 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 

Example 98 

The following C.pneumoniae protein (pid 4376979) was expressed <SEQ ID 195; cp6979>: 

1 MSVNPSGNSK NDLWITGAHD QHPDVKESGV TSANLGSHRV TASGGRQGLL 

51 ARIKEAVTGF FSRMSFFRSG APRGSQQPSA PSADTVRSPL PGGDARATEG 

101 AGRNLIKKGY QPGMKVTIPQ VPGGGAQRSS GSTTLKPTRP APPPPKTGGT 

151 NAKRPATHGK GPAPQPPKTG GTNAKRAATH GKGPAPQPPK GILKQPGQSG 

201 TSGKKRVSWS DED* 

The cp6979 nucleotide sequence <SEQ ID 196> is: 

1 ATGTCTGTTA ATCCATCAGG AAATTCCAAG AACGATCTCT GGATTACGGG 

51 AGCTCATGAT CAGCATCCCG ATGTTAAAGA ATCCGGGGTT ACAAGTGCTA 

101 ACCTAGGAAG TCATAGAGTG ACTGCCTCAG GAGGACGCCA AGGGTTATTA 

151 GCACGAATCA AAGAAGCAGT AACCGGGTTT TTTAGTCGGA TGAGCTTCTT 

201 CAGATCGGGA GCTCCAAGAG GTAGCCAACA ACCCTCTGCT CCATCTGCAG 

251 ATACTGTACG TAGCCCGTTG CCGGGAGGGG ATGCTCGCGC TACCGAGGGA 

301 GCTGGTAGGA ACTTAATTAA AAAAGGGTAC CAACCAGGGA TGAAAGTCAC 

351 TATCCCACAG GTTCCTGGAG GAGGGGCCCA ACGTTCATCA GGTAGCACGA 

401 CACTAAAGCC TACGCGTCCG GCACCCCCAC CTCCTAAAAC GGGTGGAACT 

451 AATGCAAAAC GTCCGGCAAC GCACGGGAAG GGTCCAGCAC CCCAGCCTCC 

501 TAAAACAGGT GGGACCAATG CTAAGCGCGC AGCAACGCAT GGGAAAGGTC 

551 CAGCACCTCA ACCTCCTAAG GGCATTTTGA AACAGCCTGG GCAGTCTGGG 

601 ACTTCAGGAA AGAAGCGTGT CAGCTGGTCT GACGAAGATT AA 

The PSORT algorithm predicts cytoplasm (0.360). 

The protein was expressed in E.coli and purified as a GST-fusion product (Figure 98A). The GST- 
fusion protein was used to immunise mice, whose sera were used in a Western blot (Figure 98B) and 
for FACS analysis (Figure 98C). 

These experiments show that cp6979 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 

Example 99 

The following C.pneumoniae protein (pid 4377028) was expressed <SEQ ID 197; cp7028>: 

1 MLLGFLCDCP CASWQCAAVA NCYDSVFMSR PEHKPNIPYI TKATRRGLRM 

51 KTLAYLASLK DARQLAYDFL KDPGSLARLA KALIAPKEAL QEGNLFFYGC 

101 SNIEDILEEM RRPHRILLLG FSYCQKPKAC PEGRFNDACR YDPSHPTCAS 

151 CSIGTMMRLN ARRYTTVIIP TFIDIAKHLH TLKKRYPGYQ ILFAVTACEL 

201 SLKMFGDYAS VMNLKGVGIR LTGRICNTFK AFKLAERGVK PGVTILEEDG 

251 FEVLARILTE YSSAPFPRDF CEIH* 

The cp7028 nucleotide sequence <SEQ ID 198> is: 

1 ATGCTTCTAG GGTTTTTGTG TGACTGCCCC TGTGCTTCGT GGCAGTGTGC 

51 GGCCGTTGCT AATTGTTATG ATTCCGTATT TATGTCTAGA CCAGAGCACA 

101 AACCTAATAT TCCTTATATT ACTAAAGCTA CAAGACGGGG TCTGCGTATG 

151 AAGACGCTTG CTTATCTGGC CTCTTTAAAA GATGCTAGAC AGCTTGCCTA 

201 TGATTTTCTG AAAGATCCTG GTTCTTTAGC TCGGTTAGCT AAGGCTTTGA 

251 TAGCTCCTAA GGAGGCCTTA CAGGAGGGCA ACCTATTTTT TTATGGCTGT 

301 AGTAATATTG AGGATATTTT AGAGGAGATG CGTCGTCCTC ATAGAATCCT 

351 TTTGTTAGGA TTTTCTTATT GTCAAAAGCC TAAGGCATGT CCTGAAGGGC 

401 GTTTCAATGA TGCTTGTCGG TATGATCCTT CACATCCTAC ATGTGCCTCA 

451 TGTTCTATAG GGACCATGAT GCGGCTGAAT GCTCGTAGAT ACACTACTGT 

501 GATCATCCCT ACATTTATAG ATATCGCAAA ACATTTACAC ACTTTAAAAA 

551 AGCGCTACCC TGGATATCAA ATTCTCTTTG CAGTTACTGC TTGTGAACTT 

601 TCCTTAAAAA TGTTTGGAGA TTATGCCTCC GTAATGAACT TAAAGGGTGT 

651 GGGCATCAGA CTCACAGGAC GTATTTGCAA TACATTTAAG GCATTTAAAT 

701 TAGCTGAGCG AGGAGTCAAA CCAGGAGTCA CTATCCTAGA AGAAGATGGC 

751 TTTGAGGTAT TAGCAAGGAT TCTTACAGAA TACAGTAGCG CTCCTTTCCC 

801 TAGAGACTTT TGTGAGATCC ATTAG 

The PSORT algorithm predicts cytoplasm (0.1453). 

The protein was expressed in E.coli and purified as a GST-fusion product (Figure 99A). The 
recombinant protein was used to immunise mice, whose sera were used in a Western blot (Figure 
99B) and for FACS analysis (Figure 99C). 

These experiments show that cp7028 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 

Example 100 

The following C.pneumoniae protein (pid 4377355) was expressed <SEQ ID 199; cp7355>: 

1 MKKVVTLSII FFATYCASEL SAVTVVAVPL SEAPGKIQVR PVVGLQFQEE 
51 QGSVPYSFYY PYDYGYYYPE TYGYTKNTGQ ESRECYTRFE DGTIFYECD* 

The cp7355 nucleotide sequence <SEQ ID 200> is: 

1 ATGAAGAAAG TCGTAACACT ATCCATTATA TTTTTCGCAA CGTATTGTGC 

51 ATCAGAGCTT AGTGCTGTAA CTGTAGTGGC TGTGCCTTTA TCAGAGGCTC 

101 CAGGGAAGAT TCAAGTTCGT CCCGTCGTTG GTCTGCAATT TCAAGAAGAA 

151 CAGGGTTCTG TGCCCTATAG TTTTTATTAT CCTTATGACT ATGGGTATTA 

201 CTATCCAGAG ACTTATGGCT ATACTAAAAA TACAGGTCAA GAAAGTCGCG 

251 AATGTTATAC CCGATTTGAA GATGGCACAA TTTTTTATGA ATGCGATTAG 

The PSORT algorithm predicts inner membrane (0.143). 

The protein was expressed in E.coli and purified as a GST-fusion (Figure 100A) and a his-tag 
product. The proteins were used to immunise mice, whose sera were used in a Western blot (Figure 
100B) and for FACS analysis (Figure 100C). 

These experiments show that cp7355 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 

Example 101 

The following C.pneumoniae protein (pid 4377380) was expressed <SEQ ID 201; cp7380>: 

1 VHYCERTLDP KYILKIALKL RQSLSLFFQN SQSLQRAYST PYSYYRIILQ 

51 KENKEKQALA RHKCISILEF FKNLLFVHLL SLSKNQREGC STDMAVVSTP 

101 FFNRNLWYRL LSSRFSLWKS YCPRFFLDYL EAFGLLSDFL DHQAVIKFFE 

151 LETHFSYYPV SGFVAPHQYL SLLQDRYFPI ASVMRTLDKD NFSLTPDLIH 

201 DLLGHVPWLL HPSFSEFFIN MGRLFTKVIE KVQALPSKKQ RIQTLQSNLI 

251 AIVRCFWFTV ESGLIENHEG RKAYGAVLIS SPQELGHAFI DNVRVLPLEL 

301 DQIIRLPFNT STPQETLFSI RHFDELVELT SKLEWMLDQG LLESIPLYNQ 

351 EKYLSGFEVL CQ* 

The cp7380 nucleotide sequence <SEQ ID 202> is: 

1 GTGCACTACT GCGAGAGAAC CCTGGACCCA AAGTATATTC TGAAGATTGC 

51 TCTAAAGCTG AGACAATCAC TTTCCCTGTT CTTCCAGAAC AGCCAATCAC 

101 TCCAACGTGC ATACTCGACC CCATATTCCT ACTACCGAAT CATTCTACAA 

151 AAGGAAAATA AAGAGAAGCA AGCTTTAGCT CGACACAAAT GCATTTCTAT 

201 TTTAGAATTT TTCAAAAACT TACTCTTTGT TCATCTTCTG TCATTATCAA 

251 AGAATCAAAG GGAAGGTTGC TCCACTGATA TGGCTGTTGT AAGCACTCCC 

301 TTTTTTAATC GGAATTTATG GTATCGACTC CTTTCCTCAC GGTTTTCTCT 

351 ATGGAAAAGC TATTGTCCAA GATTTTTTCT TGATTACTTA GAAGCTTTCG 

401 GTCTCCTTTC TGATTTCTTA GACCATCAAG CAGTCATTAA ATTCTTCGAA 

451 TTAGAAACAC ATTTTTCCTA TTATCCCGTT TCAGGATTTG TAGCTCCCCA 

501 TCAATACTTG TCTCTGTTGC AGGACCGTTA CTTTCCCATT GCCTCTGTAA 

551 TGCGAACTCT CGATAAAGAT AATTTCTCCT TAACTCCTGA TCTCATCCAT 

601 GACCTTTTAG GGCACGTGCC TTGGCTTCTA CATCCCTCAT TTTCTGAATT 

651 TTTCATAAAC ATGGGAAGAC TCTTCACTAA AGTCATAGAA AAAGTACAAG 

701 CTCTTCCTAG TAAAAAACAA CGCATACAAA CCCTACAAAG CAATCTGATC 

751 GCTATTGTAC GCTGCTTTTG GTTTACTGTT GAAAGCGGAC TTATTGAAAA 

801 CCATGAAGGA AGAAAAGCAT ATGGAGCCGT TCTTATCAGT TCTCCTCAGG 

851 AACTTGGACA CGCTTTCATT GATAACGTAC GTGTTCTCCC TTTAGAATTG 

901 GATCAGATTA TTCGTCTTCC CTTCAATACA TCAACTCCAC AAGAGACTTT 

951 ATTTTCAATA AGACATTTTG ATGAACTGGT AGAACTCACT TCAAAATTAG 

1001 AATGGATGCT CGACCAAGGT CTGTTAGAAT CAATTCCCCT TTACAATCAA 

1051 GAGAAATATC TTTCTGGTTT TGAGGTACTT TGCCAATGA 

The PSORT algorithm predicts inner membrane (0.1362). 

The protein was expressed in E.coli and purified as a GST-fusion product (Figure 101A). The 
recombinant protein was used to immunise mice, whose sera were used in a Western blot (Figure 
101B) and for FACS analysis (Figure 101C). 

These experiments show that cp7380 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 



Example 102 



The following C.pneumoniae protein (pid 4376904) was expressed <SEQ ID 203; cp6904>: 

1 MMNYEDAKLR GQAVAILYQI GAIKFGKHIL ASGEETPLYV DMRLVISSPE 
51 VLQTVATLIW RLRPSFNSSL LCGVPYTALT LATSISLKYN IPMVLRRKEL 
101 QNVDPSDAIK VEGLFTPGQT CLVINDMVSS GKSIIETAVA LEENGLVVRE 
151 ALVFLDRRKE ACQPLGPQGI KVSSVFTVPT LIKALIAYGK LSSGDLTLAN 
201 KISEILEIES* 

The cp6904 nucleotide sequence <SEQ ID 204> is: 

1 ATGATGAACT ACGAAGATGC AAAATTACGC GGTCAAGCTG TAGCAATTCT 

51 ATACCAAATC GGAGCTATAA AGTTCGGAAA ACATATTCTC GCTAGCGGAG 

101 AAGAAACTCC TCTGTATGTA GATATGCGTC TTGTGATCTC CTCTCCAGAA 

151 GTTCTCCAGA CAGTGGCAAC TCTTATTTGG CGCCTCCGCC CCTCATTCAA 

201 TAGTAGCTTA CTCTGCGGAG TCCCTTATAC TGCTCTAACC CTAGCAACCT 

251 CGATCTCTTT AAAATATAAC ATCCCTATGG TATTGCGAAG GAAGGAATTA 

301 CAGAATGTAG ACCCCTCGGA CGCTATTAAA GTAGAAGGGT TATTTACTCC 

351 AGGACAAACT TGTTTAGTCA TCAATGATAT GGTTTCCTCA GGAAAATCTA 

401 TAATAGAGAC AGCAGTCGCA CTGGAAGAAA ATGGTCTGGT AGTTCGTGAA 

451 GCATTGGTAT TCTTAGATCG TAGAAAAGAA GCGTGTCAAC CACTTGGTCC 

501 ACAGGGAATA AAAGTCAGTT CGGTATTTAC TGTACCCACT CTGATAAAAG 

551 CTTTGATCGC TTATGGGAAG CTAAGCAGTG GTGATCTAAC CCTGGCAAAC 

601 AAAATTTCCG AAATTCTAGA AATTGAATCT TAA 

The PSORT algorithm predicts cytoplasm (0.0358). 

The protein was expressed in E.coli and purified as a his-tag product (Figure 102A). The 
recombinant protein was used to immunise mice, whose sera were used in a Western blot (Figure 
102B) and for FACS analysis. 

The cp6904 protein was also identified in the 2D-PAGE experiment. 

These experiments show that cp6904 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 

Example 103 

The following C.pneumoniae protein (pid 4376964) was expressed <SEQ ID 205; cp6964>: 

1 MKKLIALIGI FLVPIKGNTN KEHDAHATVL KAARAKYNLF FVQDVFPVHE 
51 VIEPISPDCL VHYEGWV* 

The cp6964 nucleotide sequence <SEQ ID 206> is: 

1 ATGAAAAAAT TGATTGCTTT GATAGGGATA TTTCTTGTTC CAATAAAAGG 
51 AAATACCAAT AAGGAACACG ACGCTCACGC GACTGTTTTA AAAGCGGCCA 
101 GAGCAAAGTA TAATTTGTTC TTTGTTCAGG ATGTTTTCCC TGTACACGAA 
151 GTTATCGAGC CTATTTCTCC CGATTGCCTG GTACATTATG AAGGGTGGGT 
201 TTGA 

The PSORT algorithm predicts inner membrane (0.091). 

The protein was expressed in E.coli and purified as a GST-fusion product (Figure 103A) and also in 
his-tagged form. The recombinant proteins were used to immunise mice, whose sera were used in a 
Western blot (Figure 103B) and for FACS analysis (Figure 103C). 

These experiments show that cp6964 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 
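
Each example pairs a protein listing with its nucleotide listing, and the two can be cross-checked by translating the coding sequence. The following is a minimal Python sketch and not part of the original disclosure. It assumes the standard genetic code (NCBI translation table 1) and a literal codon-by-codon translation; the helper names `strip_listing` and `translate` are hypothetical, and the two listings are the cp6964 sequences of Example 103 with optical-recognition errors removed.

```python
import re
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def strip_listing(block: str) -> str:
    """Drop position numbers and whitespace from a numbered sequence listing."""
    return re.sub(r"[\d\s]+", "", block).upper()

def translate(nucleotides: str) -> str:
    """Translate codon by codon with the standard table; '*' marks a stop."""
    usable = len(nucleotides) - len(nucleotides) % 3
    return "".join(CODON_TABLE[nucleotides[i:i + 3]] for i in range(0, usable, 3))

CP6964_NT = """
1 ATGAAAAAAT TGATTGCTTT GATAGGGATA TTTCTTGTTC CAATAAAAGG
51 AAATACCAAT AAGGAACACG ACGCTCACGC GACTGTTTTA AAAGCGGCCA
101 GAGCAAAGTA TAATTTGTTC TTTGTTCAGG ATGTTTTCCC TGTACACGAA
151 GTTATCGAGC CTATTTCTCC CGATTGCCTG GTACATTATG AAGGGTGGGT
201 TTGA
"""
CP6964_AA = """
1 MKKLIALIGI FLVPIKGNTN KEHDAHATVL KAARAKYNLF FVQDVFPVHE
51 VIEPISPDCL VHYEGWV*
"""

consistent = translate(strip_listing(CP6964_NT)) == strip_listing(CP6964_AA)
print("cp6964 listings agree:", consistent)
```

A literal translation is used deliberately: listings that open with GTG or TTG (for example cp6449 and cp7387) are written in this text with V or L as the first residue, which is what the standard table gives for those codons.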

Example 104 



The following C.pneumoniae protein (pid 4377387) was expressed <SEQ ID 207; cp7387>: 

1 LNFAKIDHNH LYLTCLGDLG VACPILSTDC LPNYSEKASH EVLVYSKFRC 

51 ISGEPSRLAT SGNDTYYSIV SLPIGLRYEV TSPSGRHDFN IDMHVAPKIG 

101 AVLSHGTREA KEIPGSSKDY AFFSLTARES LMISEKLAMT FQVSEVIQNC 

151 YSQCTKVTKT NLKEQYRHLS HNTGFELSVK SAF* 

The cp7387 nucleotide sequence <SEQ ID 208> is: 

1 TTGAATTTTG CAAAGATTGA TCACAATCAT CTCTACCTTA CATGTTTGGG 

51 AGATCTTGGT GTAGCTTGTC CTATACTTTC TACAGATTGT CTACCTAATT 

101 ATAGCGAGAA AGCATCTCAT GAGGTTCTTG TTTATAGTAA ATTTAGATGC 

151 ATTTCTGGAG AGCCATCTCG ACTTGCAACT TCAGGAAATG ACACATATTA 

201 TTCTATAGTA AGTTTACCTA TAGGACTCCG TTACGAAGTG ACTTCACCAT 

251 CAGGACGTCA TGATTTCAAT ATTGATATGC ATGTAGCTCC AAAGATAGGT 

301 GCAGTACTCT CTCATGGAAC ACGAGAGGCT AAAGAGATCC CAGGATCTTC 

351 AAAAGACTAT GCATTTTTTA GCTTGACTGC TAGAGAAAGT TTAATGATTT 

401 CTGAAAAGCT TGCGATGACT TTCCAAGTTA GCGAAGTTAT TCAGAATTGT 

451 TATTCACAAT GTACTAAAGT AACGAAAACT AATTTAAAAG AACAGTATAG 

501 GCACTTATCC CACAATACAG GGTTTGAGTT AAGCGTCAAG TCTGCATTCT 

551 AA 



The PSORT algorithm predicts inner membrane (0.043). 

The protein was expressed in E.coli and purified as a his-tagged fusion product (Figure 104A) and 
also as a GST-fusion (Figure 104B). The recombinant proteins were used to immunise mice, whose 
sera were used in a Western blot and for FACS analysis (Figure 104C; his-tagged). 

These experiments show that cp7387 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 

Example 105 

The following C.pneumoniae protein (pid 4376281) was expressed <SEQ ID 209; cp6281>: 

1 MFLQFFHPIV FSDQSLSFLP YLGKSSGIIE KCSNIVEHYL HLGGDTSVII 

51 TGVSGATFLS VDHALPISKS EKIIKILSYI LILPLILALF IKIVLRIILF 

101 FKYRGLILDV KKEDLKKTLT PDQENLSLPL PSPTTLKKIH ALHILVRSGK 

151 TYNELIQEGF SFTKITDLGQ APSPKQDIGF SYNSLLPNFY FHSLVSVPNI 

201 SGEERALNYH KEQQEEMAVK LKTMQACSFV FRSLHLPSMQ TKDKKAGFGL 

251 LTFFPWKIYP L* 

The cp6281 nucleotide sequence <SEQ ID 210> is: 

1 ATGTTTCTTC AGTTTTTTCA TCCTATAGTC TTCTCGGATC AGTCCTTATC 

51 TTTTCTTCCT TACCTAGGAA AAAGCTCTGG CATTATTGAA AAATGTTCCA 

101 ATATCGTTGA ACACTATTTA CATTTGGGAG GAGACACTTC TGTTATCATC 

151 ACAGGAGTTT CTGGAGCTAC CTTTCTATCT GTTGATCATG CCCTCCCAAT 

201 CTCGAAATCT GAAAAAATAA TAAAAATTCT CTCCTATATT TTAATTCTTC 

251 CTCTGATTCT AGCTCTCTTT ATTAAGATCG TTTTACGCAT TATCTTATTC 

301 TTCAAGTATC GTGGTCTAAT CCTAGATGTT AAGAAGGAGG ATTTGAAAAA 

351 AACACTTACA CCTGACCAAG AAAACCTCAG TCTTCCTTTA CCATCTCCTA 

401 CAACATTAAA GAAAATTCAT GCGCTACACA TTTTAGTGCG TTCTGGAAAA 

451 ACCTATAACG AGCTTATACA AGAAGGGTTT TCTTTCACTA AAATCACAGA 

501 TCTTGGTCAA GCTCCTTCAC CAAAGCAAGA TATTGGCTTC TCTTATAATT 

551 CCCTTCTCCC TAACTTCTAT TTTCATTCCT TGGTATCTGT TCCAAATATT 

601 TCAGGCGAGG AACGGGCTCT TAATTATCAT AAAGAACAAC AAGAGGAAAT 

651 GGCTGTTAAA TTAAAAACAA TGCAAGCGTG TTCTTTTGTC TTCCGATCCC 

701 TGCATTTACC TTCAATGCAA ACGAAGGACA AAAAGGCTGG ATTTGGACTA 

751 CTGACGTTTT TCCCTTGGAA AATCTACCCC CTATAA 

The PSORT algorithm predicts inner membrane (0.5373). 

The protein was expressed in E.coli and purified as a GST-fusion product (Figure 105A). The 
recombinant protein was used to immunise mice, whose sera were used in a Western blot (Figure 
105B) and for FACS analysis. 

These experiments show that cp6281 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 

Example 106 and Example 107

The following C.pneumoniae protein (PID 4376306) was expressed <SEQ ID 211; cp6306>:

1 MGNHETYIHP GVLPSSHAQD VSRSTVYPSR SFIMRRMLMG WNFNRVPSKS
51 SEQLMDGHRI PLIFFGKHHP TISILNVNRF SWLSIFYNGE RGF*

The cp6306 nucleotide sequence <SEQ ID 212> is: 



1 ATGGGAAACC ATGAGACCTA TATACATCCA GGAGTGCTCC CGAGTAGTCA 

51 TGCTCAGGAT GTTAGCAGAT CTACAGTTTA CCCCAGTCGA AGTTTTATCA 

101 TGAGACGTAT GCTCATGGGC TGGAATTTCA ATCGTGTTCC CTCGAAGAGC 

151 TCCGAGCAGT TAATGGATGG TCATCGCATA CCTCTTATAT TTTTTGGGAA 

201 GCATCATCCT ACTATATCTA TTTTAAATGT CAATAGATTT TCTTGGCTCT 

251 CCATTTTTTA CAATGGAGAA AGGGGGTTTT GA 



The PSORT algorithm predicts cytoplasm (0.167). 

The following C.pneumoniae protein (PID 4376434) was also expressed <SEQ ID 213; cp6434>:

1 MSESINRSIH LEASTPFFIK LTNLCESRLV KITSLVISLL ALVGAGVTLV
51 VLFVAGILPL LPVLILEIIL ITVLVLLFCL VLEPYLIEKP SKIKELPKVD
101 ELSVVETDST L*

The cp6434 nucleotide sequence <SEQ ID 214> is:

1 ATGTCTGAAA GTATTAACAG AAGCATTCAT TTAGAAGCCT CTACACCATT

51 TTTTATAAAA TTAACGAATC TCTGTGAAAG TAGATTAGTT AAGATCACTT

101 CTCTTGTTAT TTCTCTATTA GCTTTAGTGG GTGCGGGAGT CACTCTTGTG

151 GTTTTATTTG TAGCTGGGAT CCTTCCTTTA CTTCCTGTAC TCATCTTAGA

201 AATTATTTTA ATAACCGTCC TTGTCTTGCT TTTTTGTTTG GTATTGGAAC

251 CTTATTTAAT AGAAAAACCT AGTAAAATAA AGGAACTACC TAAAGTAGAC

301 GAGCTATCTG TAGTAGAAAC GGACAGTACT CTTTAA

The PSORT algorithm predicts inner membrane (0.6859).

The proteins were expressed in E.coli and purified as his-tag products (Figure 106A; 6306 = lanes
2-4; 6434 = lanes 8-10). The recombinant proteins were used to immunise mice, whose sera were
used in Western blots (Figures 106B & 107) and for FACS analysis.

These experiments show that cp6306 & cp6434 are surface-exposed and immunoaccessible proteins,
and that they are useful immunogens. These properties are not evident from the sequences alone.

Example 108

The following C.pneumoniae protein (PID 4377400) was expressed <SEQ ID 215; cp7400>:

1 MRVMRFFCLF FLGFLGSFHC VAEDKGVDLF GVWDDNQITE CDDSYMTEGR
51 EEVEKVVDA*



The cp7400 nucleotide sequence <SEQ ID 216> is: 



1 GTGAGAGTTA TGAGATTTTT TTGTCTATTT TTTCTTGGGT TCCTAGGATC 

51 TTTTCATTGT GTTGCTGAAG ACAAGGGCGT GGATTTATTT GGAGTCTGGG 

101 ACGATAACCA AATTACAGAG TGTGACGATA GTTACATGAC AGAGGGTCGT 

151 GAAGAGGTTG AAAAGGTAGT GGACGCTTAG



The PSORT algorithm predicts periplasmic space (0.924).

The protein was expressed in E.coli and purified as a GST-fusion product (Figure 108A). The
recombinant protein was used to immunise mice, whose sera were used in a Western blot (Figure 
108B) and for FACS analysis. 

These experiments show that cp7400 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 
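
The correspondence between a nucleotide listing and its protein listing can be checked by translating the open reading frame. The following is a minimal sketch for the cp7400 sequence <SEQ ID 216>, assuming the standard genetic code; the helper names `clean` and `translate` and the `initiator_as_met` switch are illustrative and are not part of the original disclosure. The listings in this document show an initial GTG or TTG codon sometimes as M and sometimes as V or L, so both readings are offered.

```python
# Standard genetic code, built from the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def clean(listing: str) -> str:
    """Drop position labels, spaces and line breaks from a nucleotide listing."""
    return "".join(ch for ch in listing.upper() if ch in "ACGT")


def translate(dna: str, initiator_as_met: bool = False) -> str:
    """Translate complete codons; '*' marks a stop codon."""
    codons = [dna[i:i + 3] for i in range(0, len(dna) - len(dna) % 3, 3)]
    protein = "".join(CODON_TABLE[c] for c in codons)
    # Assumption: an alternative start codon (GTG/TTG) may be read as Met.
    if initiator_as_met and codons and codons[0] in {"GTG", "TTG"}:
        protein = "M" + protein[1:]
    return protein


# cp7400 nucleotide sequence as listed in Example 108.
CP7400_NT = """
1 GTGAGAGTTA TGAGATTTTT TTGTCTATTT TTTCTTGGGT TCCTAGGATC
51 TTTTCATTGT GTTGCTGAAG ACAAGGGCGT GGATTTATTT GGAGTCTGGG
101 ACGATAACCA AATTACAGAG TGTGACGATA GTTACATGAC AGAGGGTCGT
151 GAAGAGGTTG AAAAGGTAGT GGACGCTTAG
"""

cp7400_protein = translate(clean(CP7400_NT))
print(cp7400_protein)
```

Calling `translate(clean(CP7400_NT), initiator_as_met=True)` gives the same sequence with M as the first residue, which is how the protein listing is written.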

Example 109 

The following C.pneumoniae protein (PID 4376395) was expressed <SEQ ID 217; cp6395>:

1 MENAMSSSFV YNGPSWILKT SVAQEVFKKH GKGIQVLLST SVMLFIGLGV

51 CAFIFPQYLI VFVLTIALLM LAISLVLFLL IRSVRSSMVD RLWCSEKGYA

101 LHQHENGPFL DVKRVQQILL RSPYIKVRAL WPSGDIPEDP SQAAVLLLSP

151 WTFFSSVDVE ALLPSPQEKE GKYIDPVLPK LSRIERVSLL VFLSAFTLDD

201 LNEQGVNPLM NNEEFLFFIN KKAREHGIQD LKHEIMSSLE KTGVPLDPSM

251 SFQVSQAMFS VYRYLRQRDL TTSELRCFHL LSCFKGDVVH CLASFENPKD

301 LADSDFLEAC KNVEWGEFIS ACEKALLKNP QGISIKDLKQ FLVR*

The cp6395 nucleotide sequence <SEQ ID 218> is:

1 ATGGAGAATG CTATGTCATC ATCGTTTGTG TATAATGGGC CTTCGTGGAT

51 TTTAAAAACG TCAGTAGCTC AGGAGGTATT TAAAAAGCAC GGTAAGGGGA

101 TTCAGGTTCT CTTAAGTACT TCAGTGATGC TTTTTATAGG TCTTGGAGTC

151 TGTGCCTTTA TATTTCCTCA ATATCTGATT GTTTTTGTTT TGACTATAGC

201 TTTGCTTATG CTCGCTATAA GCTTGGTATT GTTTCTCTTA ATACGTTCTG
251 TACGCTCTTC AATGGTAGAT CGTTTGTGGT GTTCTGAAAA AGGATATGCT

301 CTTCATCAAC ATGAGAACGG GCCTTTTTTG GATGTGAAGC GTGTACAGCA
351 AATTCTTCTA AGATCACCCT ATATTAAAGT TCGGGCTTTA TGGCCGTCTG
401 GAGATATCCC TGAGGATCCT TCACAAGCTG CGGTTCTATT ACTTTCTCCT
451 TGGACTTTCT TTTCATCCGT GGATGTAGAG GCTTTATTAC CGAGTCCTCA
501 AGAAAAGGAG GGTAAGTATA TAGATCCTGT GCTGCCTAAG TTGTCTAGGA
551 TAGAGAGAGT CTCACTTTTA GTGTTTTTGA GTGCATTTAC TTTGGATGAC 
601 TTAAACGAAC AGGGAGTCAA TCCTTTGATG AATAATGAGG AATTTTTATT 
651 TTTTATAAAT AAGAAAGCGC GTGAGCATGG GATTCAGGAT TTAAAACACG 
701 AGATTATGTC TTCGTTAGAG AAAACAGGAG TGCCATTAGA CCCCTCAATG 
751 AGTTTTCAAG TTTCACAAGC GATGTTTTCT GTATATCGCT ACTTGAGACA 
801 AAGGGATTTA ACGACTTCAG AATTAAGATG TTTTCACCTC TTAAGTTGTT 
851 TTAAAGGGGA TGTGGTTCAT TGTTTAGCTT CATTTGAAAA CCCTAAAGAT 
901 TTAGCAGATT CTGACTTTTT AGAAGCTTGT AAGAACGTGG AATGGGGTGA 
951 GTTTATTTCG GCATGTGAGA AGGCTCTTTT AAAGAATCCG CAAGGAATTT 

1001 CCATTAAGGA TCTAAAACAA TTTTTAGTGA GGTAA 

The PSORT algorithm predicts inner membrane (0.6307). 

The protein was expressed in E.coli and purified as a GST-fusion product (Figure 109A). The
recombinant protein was used to immunise mice, whose sera were used in a Western blot (Figure 
109B) and for FACS analysis. 

These experiments show that cp6395 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 
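
The sequence listings in these Examples are laid out as rows of five ten-character blocks, each row prefixed with the position of its first residue. A minimal sketch of how such a listing might be normalised and sanity-checked is given below, using the cp6306 nucleotide sequence <SEQ ID 212> of Example 106. The function names `parse_listing`, `format_listing` and `looks_like_orf` are hypothetical helpers introduced for illustration only.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def parse_listing(listing: str) -> str:
    """Join a numbered listing into one string, ignoring labels and spacing."""
    return "".join(ch for ch in listing.upper() if ch.isalpha() or ch == "*")


def format_listing(seq: str, block: int = 10, per_row: int = 5) -> str:
    """Re-emit a sequence as labelled rows of `per_row` blocks of `block` letters."""
    row_len = block * per_row
    rows = []
    for start in range(0, len(seq), row_len):
        chunk = seq[start:start + row_len]
        blocks = [chunk[i:i + block] for i in range(0, len(chunk), block)]
        rows.append(f"{start + 1} " + " ".join(blocks))
    return "\n".join(rows)


def looks_like_orf(dna: str) -> bool:
    """True if the length is a whole number of codons and the last codon is a stop."""
    return len(dna) % 3 == 0 and dna[-3:] in STOP_CODONS


# cp6306 nucleotide sequence as listed in Example 106.
CP6306_NT = """
1 ATGGGAAACC ATGAGACCTA TATACATCCA GGAGTGCTCC CGAGTAGTCA
51 TGCTCAGGAT GTTAGCAGAT CTACAGTTTA CCCCAGTCGA AGTTTTATCA
101 TGAGACGTAT GCTCATGGGC TGGAATTTCA ATCGTGTTCC CTCGAAGAGC
151 TCCGAGCAGT TAATGGATGG TCATCGCATA CCTCTTATAT TTTTTGGGAA
201 GCATCATCCT ACTATATCTA TTTTAAATGT CAATAGATTT TCTTGGCTCT
251 CCATTTTTTA CAATGGAGAA AGGGGGTTTT GA
"""

cp6306 = parse_listing(CP6306_NT)
print(len(cp6306), looks_like_orf(cp6306))
print(format_listing(cp6306))
```

Because `parse_listing` discards digits and spaces, a row whose label or blocks were split during transcription is joined back into the same sequence as an intact row.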



Example 110 

The following C.pneumoniae protein (PID 4376396) was expressed <SEQ ID 219; cp6396>: 

1 MIEFAFVPHT SVTADRIEDR MACRMNKLST LAITSLCVLI SSVCIMIGIL

51 CISGTVGTYA FVVGIIFSVL ALVACVFFLY FFYFSSEEFK CASSQEFRFL

101 PIPAVVSALR SYEYISQDAI NDVIKDTMQL STLSSLLDPE AFFLEFPYFN

151 SLIVNHSMKE ADRLSREAFL ILLGEITWKD CETKILPWLK DPNITPDDFW

201 KLLKDHFDLK DFKKRIATWI RKAYPEIRLP KKHCLDKSIY KGCCKFLLLS

251 ENDVQYQRLL HKVCYFSGEF PAMVLGLGSE VPMVLGLPKV PKDLTWEMFM
301 ENMPVLLQSK REGHWKISLE DVASL*

The cp6396 nucleotide sequence <SEQ ID 220> is: 

1 ATGATCGAGT TTGCTTTTGT TCCTCATACC TCCGTGACAG CGGATCGGAT 

51 TGAGGATCGC ATGGCCTGTC GCATGAACAA GTTGTCTACT TTAGCAATTA 

101 CAAGTCTTTG TGTATTGATC AGTTCAGTTT GTATTATGAT TGGGATTTTA 

151 TGCATTTCTG GAACGGTTGG GACCTATGCA TTTGTTGTAG GAATTATTTT

201 TTCTGTGCTT GCTTTGGTAG CATGTGTTTT CTTTCTTTAT TTCTTTTATT
251 TTTCTTCTGA GGAATTTAAG TGTGCTTCTT CGCAGGAGTT TCGTTTTTTG

301 CCTATACCAG CTGTGGTTTC TGCATTGCGT TCCTATGAAT ACATTTCTCA
351 GGACGCTATC AATGACGTTA TAAAAGATAC GATGCAGTTG TCTACCCTTT
401 CTTCTCTTTT AGATCCCGAA GCTTTTTTCT TAGAATTTCC TTATTTTAAC
451 TCTTTGATAG TGAATCATTC GATGAAGGAA GCGGATCGTT TGTCTCGAGA
501 GGCTTTTTTG ATTTTATTAG GTGAGATTAC TTGGAAGGAT TGTGAAACAA
551 AAATTTTGCC ATGGTTGAAA GATCCTAATA TCACTCCTGA TGATTTCTGG
601 AAGCTATTAA AAGACCATTT CGATTTAAAG GACTTTAAGA AGAGGATCGC
651 CACTTGGATA CGGAAGGCCT ATCCAGAAAT TAGATTACCG AAGAAGCATT
701 GTTTAGATAA GTCTATCTAT AAGGGGTGTT GTAAGTTTTT ATTACTTTCT
751 GAGAATGATG TGCAATATCA GAGGTTATTA CATAAGGTCT GTTATTTCTC
801 TGGGGAGTTT CCTGCCATGG TTTTAGGTTT GGGAAGTGAA GTGCCTATGG
851 TGTTAGGACT CCCTAAGGTT CCCAAGGATC TTACCTGGGA GATGTTTATG
901 GAAAATATGC CTGTTCTTCT GCAAAGCAAA AGAGAGGGGC ATTGGAAAAT
951 CTCCTTGGAA GACGTAGCCT CTCTTTAA

The PSORT algorithm predicts inner membrane (0.6095). 

The protein was expressed in E.coli and purified as a GST-fusion product (Figure 110A). The
recombinant protein was used to immunise mice, whose sera were used in a Western blot (Figure
110B) and for FACS analysis.

These experiments show that cp6396 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 

Example 111 

The following C.pneumoniae protein (PID 4376408) was expressed <SEQ ID 221; cp6408>:

1 MNTSLKRPLK SHFDVVGSFL RPEHLKKTRE SLKEGSISLD QLMQIEDIAI

51 QDLIKKQKAA GLSFITDGEF RRATWHYDFM WGFHGVGHHR ATEGVFFDGE

101 RAMIDDTYLT DKISVSHHPF VDHFKFVKAL EDEFTTAKQT LPAPAQFLKQ

151 MIFPNNIEVT RKFYPTNQEL IEDXVAGYRK VIRDLYDAGC RYLQLDDCTR

201 GGLVDPRVCS WYGIDEKGLQ DLIQQYLLIN NLVIADRPDD LVVNLHVCRG

251 NYHSKFFASG SYDFIAKPLF EQTNVDGYYL EFDHERSGDF SPLTFISGEK

301 TVCLGLVTSK TPTLENKDEV IARIHQAADY LPLERLSLSP QCGFASCEIG

351 NKLTEEEQWA KVALVKEISE EVWK*

The cp6408 nucleotide sequence <SEQ ID 222> is:

1 ATGAATACTT ...
51 TAGTTTTTTG ...
101 AAGGCTCTAT ...
151 CAAGATTTGA ...
201 TGGAGAATTC ...
251 ATGGCGTAGG ...
301 CGCGCTATGA ...
351 CCACCCATTT ...
401 TTACGACTGC ...
451 ATGATCTTCC ...
501 TCAGGAGCTA ...
551 ATCTTTATGA ...
601 GGAGGTTTAG ...
651 AGGTCTTCAA ...
701 TTGCAGATCG ...
751 AACTACCACT CAAAATTCTT TGCTAGTGGT AGTTATGACT TTATTGCAAA

801 GCCCCTATTC GAACAAACAA ATGTAGACGG CTACTATTTA GAGTTTGATC 

851 ATGAGCGTTC TGGAGACTTC TCTCCTCTCA CCTTCATTTC TGGAGAAAAA 

901 ACTGTCTGCT TAGGTCTTGT TACCAGCAAA ACCCCTACAC TTGAAAATAA 

951 GGATGAGGTC ATTGCTCGCA TACATCAAGC AGCAGACTAC CTGCCCTTGG 

1001 AAAGACTCTC TCTAAGTCCA CAGTGTGGTT TTGCTTCATG TGAAATAGGA 

1051 AATAAATTAA CAGAAGAAGA GCAATGGGCT AAAGTTGCTC TAGTAAAAGA 

1101 AATTTCCGAA GAAGTTTGGA AATAA 

The PSORT algorithm predicts cytoplasm (0.2171). 

The protein was expressed in E.coli and purified as a GST-fusion product (Figure 111A) and also as
a his-tagged product. The his-tag protein was used to immunise mice, whose sera were used in a
Western blot (Figure 111B) and for FACS analysis.

These experiments show that cp6408 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 
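
The statement that surface exposure is not evident from the sequence alone can be illustrated by tabulating the PSORT predictions reported in Examples 105 to 111. The sketch below is a simple tally of those reported values; the dictionary layout and the choice of "outer membrane" as the only surface-associated PSORT compartment are assumptions made for illustration.

```python
from collections import Counter

# PSORT predictions as reported in Examples 105 to 111: name -> (compartment, score).
PSORT_PREDICTIONS = {
    "cp6281": ("inner membrane", 0.5373),
    "cp6306": ("cytoplasm", 0.167),
    "cp6434": ("inner membrane", 0.6859),
    "cp7400": ("periplasmic space", 0.924),
    "cp6395": ("inner membrane", 0.6307),
    "cp6396": ("inner membrane", 0.6095),
    "cp6408": ("cytoplasm", 0.2171),
}

# Assumption: only an "outer membrane" prediction would suggest surface exposure.
SURFACE_COMPARTMENTS = {"outer membrane"}

by_compartment = Counter(comp for comp, _score in PSORT_PREDICTIONS.values())
surface_predicted = [
    name for name, (comp, _score) in PSORT_PREDICTIONS.items()
    if comp in SURFACE_COMPARTMENTS
]

for compartment, count in by_compartment.most_common():
    print(f"{compartment}: {count}")
print("predicted surface-exposed:", surface_predicted)
```

Under this assumption none of the seven proteins is predicted to be surface-associated, although each is reported as surface-exposed and immunoaccessible in the corresponding Example.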

Example 112 

The following C.pneumoniae protein (PID 4376430) was expressed <SEQ ID 223; cp6430>:

1 MKLYSISSDV DTPWIFQLMS KVDSYLFLGG NRIKVVSIVM QEPNLIIGKV

51 ENVRISTIVK ILKILSFLIF PLILIALALH YFLHAKYANH LLVSKILERA

101 PQYVPIPGRS GDTASHYKLT TLVPVSQKNL QAMGSNPLEV EAALRTTKPS

151 FFCVPAKYRQ IIISSHGIRF SLDLEQLADD INLDSVSWPT EYLNSTMDFC

201 SKADKRVIQN VQNLRTGTYI NSVGKRSLLK FMLQHLFIDG ITQENPEALP

251 NNTSGRLTLF PSVRYIYSHF TPQNPTIWPQ VFFRQGPLDE DRGGGFEILE

301 QLQELGVRFP ICPSQGPDNP NFQGFQGIRI YWEDSYQPNK EV*

The cp6430 nucleotide sequence <SEQ ID 224> is: 

1 ATGAAACTTT ATAGCATCTC TTCAGATGTA GATACACCTT GGATATTTCA 

51 GCTTATGTCA AAGGTAGATT CTTATCTTTT CTTAGGCGGG AATAGAATCA 

101 AGGTTGTATC TATAGTTATG CAAGAACCTA ACTTAATTAT TGGAAAAGTA 

151 GAAAACGTTC GGATCTCCAC AATAGTGAAA ATATTAAAGA TTTTATCCTT 

201 CTTAATCTTC CCTCTGATTT TAATCGCTTT AGCCCTACAC TATTTTCTAC

251 ATGCTAAATA TGCTAATCAC TTACTTGTAT CTAAGATTTT AGAAAGAGCT 

301 CCTCAGTATG TGCCTATTCC TGGTCGTTCA GGAGACACGG CGTCTCATTA 

351 TAAATTAACA ACATTGGTTC CAGTATCCCA AAAAAATCTA CAAGCTATGG 

401 GATCAAATCC TCTAGAAGTT GAAGCGGCTC TTCGAACTAC AAAACCCTCT 

451 TTTTTCTGTG TACCTGCAAA ATACCGTCAG ATTATAATTT CAAGTCACGG

501 CATTCGCTTT TCTTTAGATC TTGAACAACT TGCTGATGAC ATTAATTTAG 

551 ATTCGGTTTC CTGGCCTACG GAGTATCTTA ACTCTACTAT GGATTTTTGC 

601 AGCAAGGCAG ATAAACGTGT TATACAGAAT GTACAAAATC TGCGGACAGG 

651 AACTTACATA AATTCTGTAG GAAAGCGTAG CCTTTTAAAA TTCATGTTAC

701 AGCACCTATT TATTGATGGG ATCACACAAG AAAACCCTGA AGCCCTTCCT

751 AACAATACAT CTGGAAGACT GACTCTATTC CCTAGTGTTC GTTATATCTA 

801 TTCTCATTTT ACTCCACAAA ATCCTACAAT ATGGCCGCAA GTCTTTTTCA 

851 GACAAGGTCC TCTAGATGAA GATCGAGGAG GAGGATTTGA GATCTTAGAG 

901 CAATTACAAG AGTTAGGAGT TAGGTTTCCA ATTTGCCCCT CTCAAGGACC 

951 AGACAATCCT AATTTTCAAG GTTTTCAAGG GATTCGTATC TATTGGGAAG 

1001 ATTCCTATCA ACCCAATAAG GAGGTTTAA

The PSORT algorithm predicts inner membrane (0.5140). 

The protein was expressed in E.coli and purified as a GST-fusion product (Figure 112A). The
recombinant protein was used to immunise mice, whose sera were used in a Western blot (Figure
112B) and for FACS analysis.

These experiments show that cp6430 is a surface-exposed and immunoaccessible protein, and that it
is a useful immunogen. These properties are not evident from the sequence alone.

Example 113

The following C.pneumoniae protein (PID 4376439) was expressed <SEQ ID 225; cp6439>:

1 MSYDTLFKNL EKEDSVHKIC NEIFALVPRL NTIACTEAII KNLPKADIHV

51 HLPGTITPQL AWILGVKNGF LKWSYNSWTN HRLLSPKNPH KQYSNIFRNF

101 QDICHEKDPD LSVLQYNILN YDFNSFDRVM ATVQGHRFPP GGIQNEEDLL

151 LIFNNYLQQC LDDTIVYTEV QQNIRLAHVL YPSLPEKHAR MKFYQILYRA

201 SQTFSKHGIT LRFLNCFNKT FAPQINTQEP AQEAVQWLQE VDSTFPGLFV

251 GIQSAGSESA PGACPKRLAS GYRNAYDSGF GCEAHAGEGI ETRTIFSSAK

301 VNPEGLIEIT RVTFSSLKRK QPSSLPIRVT CQLG*

The cp6439 nucleotide sequence <SEQ ID 226> is:

1 ATGTCTTATG ATACGTTATT CAAGAATCTT GAAAAGGAAG ATTCTGTACA

51 TAAGATATGC AATGAGATCT TTGCATTAGT ACCACGACTC AATACAATCG

101 CTTGCACCGA AGCTATCATC AAAAACCTCC CCAAAGCAGA TATCCATGTA

151 CACCTTCCTG GGACCATAAC ACCTCAATTA GCTTGGATTT TAGGTGTGAA

201 AAATGGGTTC TTAAAATGGT CTTATAATTC TTGGACCAAT CATCGATTAC

251 TTTCTCCTAA GAATCCTCAT AAACAATACT CCAATATTTT CCGAAACTTT

301 CAAGATATCT GTCACGAAAA GGATCCGGAT TTAAGTGTAT TACAATATAA

351 TATCTTAAAT TACGATTTTA ATAGCTTTGA TAGAGTGATG GCTACAGTAC

401 AAGGACATCG CTTTCCTCCT GGAGGAATCC AAAATGAAGA AGACCTTCTT

451 CTCATTTTCA ATAACTATCT CCAGCAATGT CTGGACGATA CTATCGTGTA

501 TACTGAAGTA CAACAAAATA TCCGCCTTGC CCATGTTTTG TATCCTTCAT

551 TACCTGAAAA GCACGCGCGT ATGAAGTTTT ATCAAATCTT GTATCGTGCT

601 TCGCAAACGT TTTCAAAACA CGGGATTACT TTACGATTTT TAAACTGCTT

651 CAATAAAACA TTTGCTCCAC AAATAAACAC ACAAGAACCT GCCCAAGAAG

701 CTGTTCAATG GCTCCAAGAG GTTGATTCTA CATTTCCTGG TCTATTTGTA

751 GGGATACAAT CCGCAGGATC AGAATCTGCG CCCGGAGCCT GTCCTAAGCG

801 ATTAGCTTCT GGATATAGAA ATGCTTATGA CTCAGGGTTT GGTTGTGAAG

851 CTCATGCTGG AGAAGGCATA GAGACCCGGA CTATTTTTTC GTCAGCTAAG

901 GTAAATCCAG AGGGATTGAT CGAGATAACC CGAGTGACTT TCTCGTCTCT

951 TAAACGAAAA CAGCCATCTA GTTTACCCAT AAGAGTTACT TGCCAGTTAG

1001 GATAA

The PSORT algorithm predicts cytoplasm (0.1628).

The protein was expressed in E.coli and purified as a GST-fusion product (Figure 113A). The
recombinant protein was used to immunise mice, whose sera were used in a Western blot (Figure
113B) and for FACS analysis.

These experiments show that cp6439 is a surface-exposed and immunoaccessible protein, and that it
is a useful immunogen. These properties are not evident from the sequence alone.

Example 114

The following C.pneumoniae protein (PID 4376440) was expressed <SEQ ID 227; cp6440>:

1 LQSARRHLNT IFILDFGSQY TYVLAKQVRK LFVYCEVLPW NISVQCLKER

51 APLGIILSGG PHSVYENKAP HLDPEIYKLG IPILAICYGM QLMARDFGGT

101 VSPGVGEFGY TPIHLYPCEL FKHIVDCESL DTEIRMSHRD HVTTIPEGFN

151 VIASTSQCSI SGIENTKQRL YGLQFHPEVS DSTPTGNKIL ETFVQEICSA

201 PTLWNPLYIQ QDLVSKIQDT VIEVFDEVAQ SLDVQWLAQG TIYSDVIESS

251 RSGHASEVIK SHHNVGGLPK NLKLKLVEPL RYLFKDEVRI LGEALGLSSY

301 LLDRHPFPGP GLTIRVIGEI LPEYLAILRR ADLIFIEELR KAKLYDKISQ

351 AFALFLPIKS VSVKGDCRSY GYTIALRAVE STDFMTGRWA YLPCDVLSSC

401 SSRIINEIPE VSRVVYDISD KPPATIEWE*

The cp6440 nucleotide sequence <SEQ ID 228> is:

1 TTGCAGAGTG CAAGGAGACA TTTGAACACC ATATTTATTC TAGATTTTGG

51 ATCTCAATAT ACTTATGTAT TAGCAAAGCA AGTGCGGAAG TTATTTGTAT

101 ATTGCGAAGT TCTTCCCTGG AATATCTCTG TGCAATGTTT AAAAGAAAGA

151 GCGCCTTTGG GGATCATTCT CTCAGGAGGT CCTCACTCTG TCTATGAAAA

201 CAAGGCTCCA CATTTAGATC CTGAAATCTA TAAACTTGGC ATTCCAATTC

251 TAGCTATTTG CTATGGCATG CAGCTTATGG CTAGAGATTT TGGAGGGACT 

301 GTAAGCCCTG GTGTAGGAGA ATTTGGATAT ACGCCCATCC ATCTGTATCC 

351 TTGTGAGCTC TTCAAACACA TCGTCGACTG CGAATCTCTA GACACAGAGA 

401 TTCGGATGAG CCATCGGGAT CATGTTACGA CAATTCCTGA AGGATTTAAT

451 GTAATCGCAT CCACCTCACA ATGCTCGATC TCAGGAATAG AAAATACCAA 

501 ACAACGGTTG TACGGGCTGC AATTTCATCC CGAGGTTTCT GACTCCACTC 

551 CAACGGGAAA TAAGATTCTA GAAACTTTTG TTCAAGAGAT CTGTTCTGCT 

601 CCCACACTAT GGAATCCCTT GTATATTCAG CAAGACCTTG TAAGTAAAAT 

651 TCAAGATACC GTTATTGAAG TATTTGATGA AGTCGCTCAG TCATTAGACG 

701 TACAATGGTT AGCTCAAGGA ACCATCTACT CAGATGTTAT TGAGTCCTCA 

751 CGCTCTGGAC ATGCCTCCGA AGTAATAAAA TCACATCATA ATGTAGGGGG

801 GCTTCCAAAA AATCTTAAGC TGAAGTTAGT CGAGCCCTTA CGTTATTTAT 

851 TTAAAGATGA AGTTCGAATT TTAGGAGAAG CCCTAGGACT TTCTAGCTAT 

901 CTCTTGGACA GGCATCCTTT TCCTGGACCT GGCTTGACAA TTCGTGTGAT

951 TGGAGAGATC CTTCCTGAAT ATCTAGCCAT TTTACGACGG GCGGACCTCA 

1001 TCTTTATAGA AGAGCTTAGG AAAGCAAAAC TCTACGATAA AATAAGCCAA 

1051 GCCTTTGCTC TATTTCTTCC TATAAAATCA GTATCTGTAA AAGGAGATTG 

1101 TAGAAGCTAT GGTTATACCA TAGCATTACG TGCTGTAGAA TCTACAGATT

1151 TCATGACAGG ACGATGGGCC TACCTTCCAT GCGATGTTCT CAGTTCTTGC 

1201 TCATCGCGAA TTATTAATGA AATACCCGAG GTAAGCCGAG TGGTCTATGA 

1251 TATTTCTGAC AAGCCACCAG CAACTATAGA ATGGGAATAG 

The PSORT algorithm predicts cytoplasm (0.0481). 

The protein was expressed in E.coli and purified as a GST-fusion product (Figure 114A) and also as
a his-tagged product. The recombinant proteins were used to immunise mice, whose sera were used
in a Western blot (Figure 114B) and for FACS analysis.

These experiments show that cp6440 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 

Example 115 

The following C.pneumoniae protein (PID 4376475) was expressed <SEQ ID 229; cp6475>:

1 MNTYTFSPTL QKSFSLFLLE KLDSYFFFGG TRTQILVITP TNIRLAAKKR

51 GCKVSTIEKI IKILSFILLP LVIIAFILRY FLHKKFDKQF LCIPKVISNE

101 DEALLGSRPQ AVEKAVREIS PAFFSIPRKY QLIRIDTPKD DAPSILFPIG

151 IEIILKDLCI DTLKQSNLFL KREMDFLGHP EEKALFDSIC SIEKDQEWMS

201 LESKKLLITH FLKYLFVSGI EQLNPGFNPE NGRGYFSEIS TAKIHFHQHG

251 RYGPIRSSGP IMKEI*

The cp6475 nucleotide sequence <SEQ ID 230> is: 

1 ATGAATACCT ATACCTTCTC TCCTACACTT CAGAAAAGCT TCAGCCTATT 

51 TCTTTTAGAA AAATTAGACT CTTACTTTTT CTTTGGAGGG ACTCGTACAC 

101 AAATCTTAGT CATCACACCA ACCAATATTA GATTAGCAGC TAAAAAAAGA 

151 GGGTGTAAGG TTTCTACTAT AGAAAAGATA ATCAAGATCC TCTCTTTTAT 

201 CCTGCTGCCC CTAGTTATCA TTGCCTTTAT ACTTCGCTAT TTCTTACATA 

251 AGAAATTCGA TAAACAGTTC TTGTGTATCC CAAAAGTCAT TTCTAACGAA 

301 GACGAAGCTC TTCTTGGATC TAGACCACAA GCAGTTGAAA AAGCAGTTCG

351 AGAAATATCT CCAGCCTTCT TCTCTATACC AAGAAAATAC CAACTTATTA 

401 GAATCGACAC TCCTAAAGAT GACGCTCCCT CAATCCTTTT CCCTATAGGC 

451 ATAGAGATCA TTCTCAAAGA TTTATGTATT GATACACTCA AGCAATCTAA

501 TCTTTTCCTT AAAAGAGAAA TGGATTTCTT AGGTCATCCA GAAGAAAAAG 

551 CATTATTCGA CTCGATATGT TCTATAGAAA AAGATCAAGA ATGGATGAGC 

601 TTGGAAAGTA AAAAACTTTT AATCACGCAC TTCCTAAAGT ATCTCTTTGT

651 CTCTGGAATC GAACAACTAA ATCCAGGCTT TAACCCAGAG AATGGGCGTG 

701 GGTATTTTTC AGAAATAAGT ACAGCAAAGA TCCATTTTCA TCAGCACGGT 

751 CGATATGGGC CAATCCGTTC TTCGGGACCC ATCATGAAGG AAATATAA



The PSORT algorithm predicts inner membrane (0.5373).

The protein was expressed in E.coli and purified as a GST-fusion product (Figure 115A). The
recombinant protein was used to immunise mice, whose sera were used in a Western blot (Figure
115B) and for FACS analysis.

These experiments show that cp6475 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 

Example 116 

The following C.pneumoniae protein (PID 4376482) was expressed <SEQ ID 231; cp6482>: 

1 MDVELEALKR EFAHLKDQKP TSDQEITSLY QCLDHLEFVL LGLGQDKFLK 

51 ATEDEDVLFE SQKAIDAWNA LLTKARDVLG LGDIGAIYQT IEFLGAYLSK 

101 VNRRAFCIAS EIHFLKTAIR DLNAYYLLDF RWPLCKIEEF VDWGNDCVEI

151 AKRKLCTFEK ETKELNESLL REEHAMEKCS IQDLQRKLSD IIIELHDVSL 

201 FCFSKTPSQE EYQKDCLYQS RLRYLLLLYE YTLLCKTSTD FQEQARAKEE 

251 FIREKFSLLE LEKGIKQTKE LEFAIAKSKL ERGCLVMRKY EAAAKHSLDS 

301 MFEEETVKSP RKDTE* 

The cp6482 nucleotide sequence <SEQ ID 232> is: 

1 ATGCTAGTAG AGTTAGAGGC TCTTAAAAGA GAGTTTGCGC ATTTAAAAGA 

51 CCAGAAGCCG ACAAGTGACC AAGAGATCAC TTCACTTTAT CAATGTTTGG

101 ATCATCTTGA ATTCGTTTTA CTCGGGCTGG GCCAGGACAA ATTTTTAAAG 

151 GCTACGGAAG ATGAAGATGT GCTTTTTGAG TCTCAAAAAG CAATCGATGC 

201 GTGGAATGCT TTATTGACAA AAGCCAGAGA TGTTTTAGGT CTTGGGGACA 

251 TAGGTGCTAT CTATCAGACT ATAGAATTCT TGGGTGCCTA TTTATCAAAA 

301 GTGAATCGGA GGGCTTTTTG TATTGCTTCG GAGATACATT TTCTAAAAAC 

351 AGCAATCCGA GATTTGAATG CATATTACCT GTTAGATTTT AGATGGCCTC

401 TTTGCAAGAT AGAAGAGTTT GTGGATTGGG GGAATGATTG TGTTGAAATA 

451 GCAAAGAGGA AGCTATGCAC TTTTGAAAAA GAAACCAAGG AGCTCAATGA 

501 GAGCCTTCTT AGAGAGGAGC ATGCGATGGA GAAATGCTCG ATTCAAGATC 

551 TGCAAAGGAA ACTTAGCGAC ATTATTATTG AATTGCATGA TGTTTCTCTT 

601 TTTTGTTTTT CTAAGACTCC CAGTCAAGAG GAGTATCAAA AGGATTGTTT 

651 GTATCAATCA CGATTGAGGT ACTTATTGTT GCTGTATGAG TATACATTGT 

701 TATGTAAGAC ATCCACAGAT TTTCAAGAGC AGGCTAGGGC TAAAGAGGAG 

751 TTCATTAGGG AGAAATTCAG CCTTCTAGAG CTCGAAAAGG GAATAAAACA 

801 AACTAAAGAG CTTGAGTTTG CAATTGCTAA AAGTAAGTTA GAACGGGGCT 

851 GTTTAGTTAT GAGGAAGTAT GAAGCTGCCG CTAAACATAG TTTAGATTCT 

901 ATGTTCGAAG AAGAAACTGT GAAGTCGCCG CGGAAAGACA CAGAATAA 

The PSORT algorithm predicts cytoplasm (0.4607). 

The protein was expressed in E.coli and purified as a GST-fusion product (Figure 116A). The 
recombinant protein was used to immunise mice, whose sera were used in a Western blot (Figure 
116B) and for FACS analysis. 

These experiments show that cp6482 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 
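
Where a protein listing and its nucleotide listing disagree, translating the nucleotides locates the affected residue. The sketch below compares the first sixteen residues printed for cp6482 <SEQ ID 231> with the translation of the first row of <SEQ ID 232>, assuming the standard genetic code. The function names are illustrative, and the comparison cannot by itself show which of the two listings carries the transcription error.

```python
# Standard genetic code, built from the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate every complete codon, ignoring any trailing partial codon."""
    return "".join(CODONS[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


def mismatches(printed: str, translated: str):
    """Return (1-based position, printed residue, translated residue) for each difference."""
    return [
        (pos, p, t)
        for pos, (p, t) in enumerate(zip(printed, translated), start=1)
        if p != t
    ]


# First row of the cp6482 nucleotide listing and the start of its protein listing.
CP6482_NT_ROW1 = "ATGCTAGTAG AGTTAGAGGC TCTTAAAAGA GAGTTTGCGC ATTTAAAAGA"
CP6482_PROTEIN_START = "MDVELEALKR EFAHLK"

dna = CP6482_NT_ROW1.replace(" ", "")
protein = CP6482_PROTEIN_START.replace(" ", "")
diffs = mismatches(protein, translate(dna))
print(diffs)  # [(2, 'D', 'L')]
```

Here the second codon, CTA, translates to L, whereas the protein listing shows D at position 2; all other compared residues agree.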



Example 117 

The following C.pneumoniae protein (PID 4376486) was expressed <SEQ ID 233; cp6486>: 

1 VVVVALFILG IFFLSGSLAF LVHTSCGVLL GAALPILCIG LVLLAVALIV

51 FLCHKHKTRQ DLDYYDQDLD SLVIHKKEIP NDISELRVTF EKLQNLFQFH

101 TKDFSDLSQE LQGKFINCME KWLTLEDEVT KFLIVRDRFL ETRRNFTTFG

151 EQVKGIQSNI FDLHEEKSSL YLELYRLRKD LQVLLNFFLL PPGILKVDYD

201 EIEAIKGLFI RLTSRLDKLD VKAQERKKFI NEMSREFKEV EKAFDIVDRA

251 TKKLMDRAKK ESPARLFMGR TESLLEMKKN EEALKNQGLD PENLSHPELF

301 SPYQQLLILN YLNSEIVLHH YEFLISGTVT SGLTLEECEN RMRAASTGLN
351 ALLVRKLQFR GAIKSAYFEK LTEIEKELRS LQDVIKSLEL ELIHKIKDIV
401 TEET*



The cp6486 nucleotide sequence <SEQ ID 234> is:

1 GTGGTGGTTG TCGCTTTATT TATCCTTGGG ATTTTCTTTT TATCTGGTTC
51 TCTTGCATTC CTTGTTCATA CGTCTTGCGG AGTTCTTTTA GGAGCGGCGC
101 TTCCCATACT TTGCATAGGT CTTGTTTTAT TGGCTGTAGC TCTTATTGTT
151 TTCTTATGTC ACAAACACAA GACTCGTCAA GATTTAGATT ATTATGATCA
201 AGATTTAGAT TCTTTGGTGA TTCATAAGAA AGAGATCCCC AATGACATCT
251 CTGAGTTGCG GGTAACATTT GAAAAGTTGC AAAATCTGTT TCAGTTCCAT
301 ACGAAAGATT TCTCTGATCT AAGCCAAGAG CTTCAGGGTA AATTTATCAA
351 TTGCATGGAG AAATGGCTAA CTTTAGAAGA CGAAGTGACT AAATTTCTTA
401 TTGTTCGAGA TAGATTTTTA GAAACCAGAA GAAATTTTAC CACTTTTGGA
451 GAACAGGTTA AAGGGATCCA AAGCAATATT TTTGATTTGC ATGAGGAAAA
501 GTCTTCATTA TATTTAGAAT TGTATAGGCT TAGGAAAGAC CTCCAAGTTC
551 TATTAAATTT TTTTCTGCTC CCCCCAGGTA TACTCAAGGT AGATTATGAT
601 GAAATTGAGG CTATCAAAGG TCTGTTTATA AGATTAACCT CTAGATTAGA
651 TAAGCTTGAT GTGAAAGCTC AGGAACGTAA GAAGTTCATT AATGAAATGA
701 GTAGGGAATT TAAAGAAGTA GAGAAAGCTT TTGATATTGT CGATAGGGCA
751 ACAAAAAAGC TTATGGATAG AGCCAAGAAA GAAAGTCCGG CACGTCTTTT
801 CATGGGTAGA ACTGAGTCTC TCTTAGAAAT GAAAAAAAAT GAAGAAGCCC
851 TTAAAAATCA GGGGCTAGAT CCTGAAAATC TTTCCCATCC TGAACTTTTT
901 AGTCCGTATC AACAGCTTTT AATTTTGAAT TATTTAAATA GCGAAATAGT
951 TCTGCATCAT TATGAGTTCC TTATTTCTGG AACAGTAACT TCTGGCCTAA
1001 CTCTTGAAGA ATGTGAAAAT CGAATGAGGG CGGCTTCTAC TGGGTTGAAC
1051 GCCCTTCTGG TGCGTAAGCT CCAGTTCAGA GGTGCTATAA AATCTGCGTA
1101 TTTTGAAAAA CTCACAGAGA TTGAAAAAGA GTTACGATCA CTTCAAGACG
1151 TAATAAAGTC ATTGGAACTA GAACTGATCC ATAAGATAAA AGATATAGTG
1201 ACAGAAGAAA CTTAG


The PSORT algorithm predicts inner membrane (0.7474). 

The protein was expressed in E.coli and purified as a GST-fusion product (Figure 117A). The 
recombinant protein was used to immunise mice, whose sera were used in a Western blot (Figure 
117B) and for FACS analysis.

These experiments show that cp6486 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 



Example 118 

The following C.pneumoniae protein (PID 4376526) was expressed <SEQ ID 235; cp6526>:

1 MSPFKKIVNR ...
51 HFISLGCSRN ...
101 DEAKDYLDHL ...
151 ENILSAIESR ...
201 AFCIIPSIKG ...
251 TDRSSQLESL ...
301 VDIPLQHIND ...
351 GETQEEFQEL ...
401 LKILSQIQKR ...
451 DPCIIVNEAK ...
501 *



The cp6526 nucleotide sequence <SEQ ID 236> is: 

1 ATGAGTCCTT TTAAGAAAAT AGTAAATCGC TTACTATGCT ATATTTCTTT 

51 TCAAAAAGAA TCAAGAACTC TCCCAATCAT TATTAGAGAA CCTAGGATGA 

101 CAACAAAAAG TTTAGGATCT TTCAATTCAG TTATTTCCAA AAATAAAATT 

151 CATTTTATTA GTTTGGGATG CTCTCGGAAC CTTGTAGATA GCGAAGTCAT 

201 GCTAGGCATT CTTCTTAAGG CAGGTTACGA GTCTACTAAT GAAATTGAAG

251 ATGCTGACTA TTTAATTTTA AATACCTGTG CGTTTTTAAA AAGTGCTAGA

301 GATGAAGCTA AAGATTATCT AGACCATCTA ATTGATGTAA AAAAAGAGAA
351 CGCTAAAATT ATTGTAACTG GATGCATGAC TTCCAACCAC AAAGATGAGC
401 TTAAACCCTG GATGTCACAC ATCCATTACC TACTAGGTTC TGGGGATGTT 
451 GAGAATATTC TTTCTGCTAT TGAGTCTCGT GAATCTGGAG AAAAAATCTC 
501 TGCAAAGAGT TACATTGAGA TGGGAGAAGT TCCAAGACAG CTTTCCACAC
551 CAAAACACTA TGCCTATTTA AAAGTTGCTG AGGGCTGTAG AAAACGTTGT 
601 GCTTTTTGTA TTATTCCTTC CATTAAAGGA AAGCTCCGCA GCAAACCTCT 
651 GGATCAAATT CTTAAAGAAT TCCGCATCCT TGTAAACAAG AGTGTGAAAG 
701 AGATTATATT GATAGCTCAA GACCTAGGAG ATTATGGAAA GGATCTCTCT 
751 ACAGACCGCA GTTCGCAGCT AGAATCACTA TTACATGAGT TACTGAAAGA 
801 GCCTGGTGAT TATTGGCTGC GGATGTTGTA TTTATATCCT GATGAAGTGA
851 GTGATGGCAT TATAGATCTT ATGCAATCTA ATCCCAAACT TCTTCCCTAT 
901 GTAGATATTC CCTTACAGCA CATTAACGAC CGTATTTTAA AGCAAATGCG 
951 AAGAACGACT TCTAGGGAGC AAATCCTAGG ATTCCTAGAA AAATTACGTG 
1001 CCAAGGTTCC TCAGGTCTAT ATCCGTTCTT CTGTTATTGT GGGTTTCCCC 
1051 GGTGAAACTC AGGAAGAATT CCAGGAGTTA GCTGATTTTA TTGGTGAGGG 
1101 TTGGATTGAT AATCTCGGAA TTTTCTTGTA CTCTCAAGAA GCGAATACCC 
1151 CGGCAGCAGA ACTCCCTGAC CAGATACCAG AAAAAGTTAA AGAATCGAGG 
1201 TTGAAAATTC TATCTCAAAT TCAGAAACGC AATGTGGATA AACATAATCA 
1251 GAAGCTCATT GGGGAAAAAA TAGAAGCAGT TATTGATAAC TATCATCCTG 
1301 AAACGAATCT TTTACTCACT GCAAGGTTCT ATGGACAAGC TCCTGAAGTG 
1351 GACCCTTGTA TTATTGTAAA TGAGGCGAAG CTTGTTTCTC ATTTTGGAGA
1401 AAGATGCTTT ATAGAAATCA CAGGGACTGC TGGTTACGAC CTTGTAGGGC 
1451 GTGTTGTAAA AAAATCTCAG AACCAAGCTT TGCTAAAAAC TAGCAAAGCT 
1501 TAG 

The PSORT algorithm predicts cytoplasm (0.1296). 

The protein was expressed in E.coli and purified as a GST-fusion product (Figure 118A) and also as 
a his-tagged product. The recombinant proteins were used to immunise mice, whose sera were used 
in a Western blot (Figure 118B) and for FACS analysis.

These experiments show that cp6526 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 

Example 119 

The following C.pneumoniae protein (PID 4376528) was expressed <SEQ ID 237; cp6528>: 

1 MKNNINNNEC YFKLDSTVDG DLLAANLKTF DTQAQGISST ETFSVQGNAT

51 FKDQVSATGL TSGTTYNLNA QNFTSSQISI DFKNNRLSNC ALPKEDCDPV

101 PANYVRSPEY FFCSKPLIGD FDFNSGESYL PLTGSEYTLY QSRNVNSIFR

151 FIGWKQSTRE LTVGGNTAIQ FLAAGTYIVS FTVGKRWGWN NGWGGAIYIN

201 NGLGQVQCES TIYSGGGYAT IGTLGTSIYR ASVDVAPNPN DPNASDRYRA

251 GIFYLSNGGS SAGIGNYSFS LLYYPDDRG*

The cp6528 nucleotide sequence <SEQ ID 238> is: 

1 ATGAAAAACA ATATTAATAA TAATGAGTGC TATTTTAAAT TAGACTCAAC 

51 TGTAGATGGT GATTTGTTAG CAGCCAATCT CAAGACCTTT GATACACAGG 

101 CCCAAGGAAT CTCATCGACT GAAACATTTT CTGTTCAGGG GAATGCAACA 

151 TTTAAAGATC AAGTTTCAGC AACTGGATTA ACTTCAGGAA CTACTTATAA 

201 TTTAAATGCA CAAAACTTTA CTTCCTCCCA AATCTCTATA GATTTTAAAA 

251 ATAATCGTCT GAGTAATTGT GCATTGCCAA AAGAAGACTG CGATCCGGTG

301 CCAGCGAATT ATGTTCGTTC TCCCGAATAT TTTTTCTGTT CCAAGCCTCT 

351 GATCGGAGAT TTTGATTTTA ACTCAGGGGA ATCTTATTTG CCTCTGACTG 

401 GTTCGGAATA TACTCTATAT CAGTCACGTA ATGTAAATAG TATATTTCGT 

451 TTTATAGGAT GGAAGCAAAG TACACGAGAA TTAACTGTAG GGGGAAATAC 

501 TGCGATACAA TTTCTTGCAG CAGGAACCTA TATCGTTTCA TTTACTGTTG 

551 GTAAACGGTG GGGATGGAAT AATGGTTGGG GAGGAGCCAT TTATATCAAT 

601 AATGGTTTAG GACAAGTCCA ATGTGAAAGC ACGATTTATA GTGGTGGAGG 

651 GTATGCAACA ATAGGTACAC TGGGGACCTC AATATATAGA GCCTCTGTAG 

701 ATGTAGCTCC TAATCCTAAT GATCCGAATG CTTCGGATCG CTATAGAGCG 

751 GGTATTTTCT ATCTCAGTAA CGGTGGTTCT AGTGCAGGTA TAGGGAATTA 

801 CTCCTTTTCT CTTCTCTATT ATCCGGACGA TAGAGGGTAG

The PSORT algorithm predicts cytoplasm (0.1668).

The protein was expressed in E.coli and purified as a GST-fusion product (Figure 119A). The 
recombinant protein was used to immunise mice, whose sera were used in a Western blot (Figure 
119B) and for FACS analysis.

These experiments show that cp6528 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 

Example 120 

The following C.pneumoniae protein (PID 4376627) was expressed <SEQ ID 239; cp6627>:

1 MKCSPLTLVP HIFLKNDCEC HRSCSLKIRT IARLILGLVL ALVSALSFVF

51 LAAPISYAIG GTLALAAIVI LIITLVVALL AKSKVLPIPN ELQKIIYNRY

101 PKEVFYFVKT HSLTVNELKI FINCWKSGTD LPPNLHKKAE AFGIDILKSI

151 DLTLFPEFEE ILLQNCPLYW LSHFIDKTES VAGEIGLNKT QKVYGLLGPL

201 AFHKGYTTIF HSYTRPLLTL ISESQYKFLY SKASKNQWDS PSVKKTCEEI

251 FKELPHNMIF RKDVQGISQF LFLFFSHGIT WEQAQMIQLI NPDNWKMLCQ

301 FDKAGGHCSM ATFGGFLNTE TNMFDPVSSN YEPTVNFMTW KELKVLLEKV

351 KESPMHPASA LVQKICVNTT HHQNLLKRWQ FVRNTSSQWT SSLPQYAFHA

401 QTYKLEKKIE SSLPIRSSL*

The cp6627 nucleotide sequence <SEQ ID 240> is: 

1 ATGAAGTGTA GTCCTTTAAC ACTAGTTCCC CATATATTTT TAAAAAATGA 

51 CTGCGAATGT CATAGATCTT GTTCTTTAAA AATTAGGACA ATTGCCCGAC 

101 TCATTCTTGG GCTTGTTCTA GCTCTTGTTA GCGCACTTTC TTTTGTTTTC 

151 CTTGCTGCGC CGATTAGCTA TGCTATTGGA GGAACTTTAG CTTTAGCCGC 

201 TATCGTAATC TTGATTATAA CGCTAGTCGT AGCACTGCTA GCTAAATCAA 

251 AGGTTCTGCC CATCCCCAAC GAACTTCAGA AGATTATTTA CAATCGCTAT 

301 CCTAAAGAAG TCTTTTATTT CGTGAAAACA CACTCCCTGA CTGTTAACGA 

351 ATTAAAAATA TTTATTAATT GCTGGAAAAG CGGTACAGAC CTGCCTCCGA

401 ATTTACATAA AAAAGCAGAG GCTTTCGGGA TCGATATTCT AAAATCTATA 

451 GATTTAACCC TGTTTCCAGA GTTCGAAGAG ATTCTTCTTC AAAACTGCCC 

501 GTTATACTGG CTCTCCCATT TTATAGACAA AACTGAATCT GTTGCTGGGG 

551 AAATCGGATT AAATAAAACA CAAAAAGTTT ATGGTTTACT TGGGCCCTTA 

601 GCGTTTCATA AAGGATATAC AACTATTTTC CACTCTTATA CACGCCCTCT 

651 ACTAACATTA ATCTCAGAAT CACAGTATAA GTTCCTATAT AGTAAAGCGT

701 CTAAGAATCA ATGGGATTCT CCTTCTGTGA AAAAAACCTG CGAAGAAATA 

751 TTCAAGGAAC TCCCCCACAA TATGATTTTC CGGAAGGATG TTCAAGGAAT 

801 CTCACAATTC TTATTTCTTT TCTTTTCTCA TGGTATCACT TGGGAACAGG 

851 CTCAGATGAT TCAACTTATA AATCCTGATA ATTGGAAAAT GTTGTGTCAG 

901 TTTGATAAAG CAGGAGGCCA CTGTTCCATG GCAACATTTG GAGGCTTTTT 

951 GAATACTGAA ACAAATATGT TCGATCCAGT ATCCTCTAAC TATGAACCTA

1001 CAGTGAACTT CATGACGTGG AAAGAATTGA AGGTTTTACT AGAGAAAGTA 

1051 AAAGAAAGTC CTATGCACCC AGCGAGTGCT CTTGTTCAGA AGATATGCGT 

1101 AAATACAACG CACCATCAAA ATCTGTTAAA ACGATGGCAA TTTGTTCGTA

1151 ATACGAGTTC ACAATGGACA TCAAGCTTAC CTCAGTATGC TTTCCACGCC 

1201 CAAACCTACA AACTAGAGAA AAAAATAGAA AGCAGTCTCC CTATACGATC 

1251 TTCCCTATAA

The PSORT algorithm predicts inner membrane (0.7198). 

The protein was expressed in E.coli and purified as a GST-fusion product (Figure 120A). The
recombinant protein was used to immunise mice, whose sera were used in a Western blot (Figure 
120B) and for FACS analysis. 

These experiments show that cp6627 is a surface-exposed and immunoaccessible protein, and that it
is a useful immunogen. These properties are not evident from the sequence alone.

Example 121



The following C.pneumoniae protein (PID 4376629) was expressed <SEQ ID 241; cp6629>: 



1 MSNITSPVIQ NNRSCNYYFE LKNSTTIHIV ISAILLCGAL IAFLCVAAPV 

51 SYILSGALLG LGLLIALIGV ILGIKKITPM ISSKEQVFPQ ELVNRIRAHY 

101 PKFVSDFVSE AKPNLKDLIS FIDLLNQLHS EVGSSTNYNV SEELQQKIDT 

151 FEGIARLKNE VRTASLKRLE SAASSRPLFP SLPKILQKVF PFFWLGEFIS 

201 AGSKVVELHR VKKIGGSLEE DLSDYIKPEM LPTYWLIPLD FRPTNSSILN 

251 LHTLVLARVL TRDVFQKLKY AALNGEWNLN HSDLNTMKQQ LFAKYHAAYQ 

301 SYKHLSQPSL QEDEFYNLLL CIFKHRYSWK QMSLIKTVPA DLWENLCCLT 

351 LDHTGRPQDM EFASLIGTLY TQGLIHKESE AFLSSLTLLS LDQFKTIRRQ 

401 STNIAMFLEN LATHNSTFRS LPPITVHPLK RSVFSQPEED ESSLLIG* 

The cp6629 nucleotide sequence <SEQ ID 242> is: 

1 ATGAGTAATA TAACCTCGCC AGTTATTCAA AATAATCGCT CTTGTAATTA 

51 TTATTTTGAA TTAAAGAATT CAACCACTAT TCATATTGTT ATCAGTGCCA 

101 TCTTACTCTG CGGAGCTTTG ATAGCTTTCT TGTGTGTAGC AGCTCCTGTT 

151 TCCTATATTC TAAGTGGCGC ATTGTTAGGA TTAGGATTAT TAATAGCCTT 

201 GATTGGTGTG ATTTTAGGAA TAAAAAAAAT CACGCCTATG ATTTCATCAA 

251 AAGAACAAGT ATTCCCCCAA GAACTCGTAA ATAGAATCAG GGCGCACTAT 

301 CCTAAATTTG TCTCTGATTT TGTTTCAGAA GCTAAACCAA ATCTTAAAGA 

351 TCTCATAAGT TTTATTGATC TTCTAAATCA ATTGCACTCT GAAGTTGGAT 

401 CATCTACAAA TTACAACGTA TCTGAAGAAC TACAACAGAA AATAGATACG 

451 TTCGAGGGTA TCGCACGCTT AAAAAATGAA GTCCGTACTG CTTCTCTTAA 

501 AAGACTTGAA AGCGCTGCTT CTTCCCGTCC CCTCTTCCCC TCTTTACCAA 

551 AAATCTTACA AAAGGTATTT CCATTTTTCT GGTTAGGAGA GTTTATTTCT 

601 GCAGGCAGCA AGGTTGTAGA GCTCCATCGA GTTAAGAAAA TTGGAGGCAG 

651 CCTCGAAGAA GACCTTAGTG ATTATATAAA ACCAGAGATG CTTCCTACCT 

701 ATTGGTTGAT TCCTTTAGAT TTTAGACCAA CAAATTCCTC TATTCTAAAT 

751 CTACACACAT TAGTTTTAGC TAGAGTCTTA ACTCGTGATG TTTTTCAACA 

801 TCTTAAGTAT GCAGCATTAA ATGGCGAGTG GAACCTGAAT CATAGTGATC 

851 TAAATACTAT GAAACAGCAG CTCTTTGCTA AATATCATGC GGCGTATCAA 

901 TCCTATAAAC ATCTATCTCA ACCCTCTCTT CAAGAGGATG AATTCTATAA 

951 CCTGCTCTTG TGTATTTTTA AGCATAGGTA CTCGTGGAAG CAGATGTCCT 

1001 TAATAAAAAC AGTCCCGGCT GATTTATGGG AAAACCTCTG TTGCTTGACT 

1051 TTAGACCATA CAGGACGACC CCAAGACATG GAATTTGCCT CTCTAATTGG 

1101 TACTCTCTAC ACACAAGGCC TAATTCATAA AGAAAGCGAA GCATTTCTTT 

1151 CTTCATTGAC ACTCCTTAGT TTAGATCAGT TTAAAACGAT CCGTCGTCAG 

1201 TCAACCAATA TAGCGATGTT CCTTGAGAAT TTAGCAACTC ATAATTCCAC 

1251 CTTTAGAAGC TTACCACCTA TAACAGTCCA TCCACTCAAG AGAAGCGTCT 

1301 TCTCCCAACC TGAAGAAGAC GAGTCCTCCC TGCTGATAGG TTAG 



The PSORT algorithm predicts inner membrane (0.5776). 

The protein was expressed in E.coli and purified as a GST-fusion product (Figure 121A). The 
recombinant protein was used to immunise mice, whose sera were used in a Western blot (Figure 
121B) and for FACS analysis. 

These experiments show that cp6629 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 
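
The relationship between a listed nucleotide sequence and its protein sequence can be illustrated with a short sketch. The Python below translates the first 30 nucleotides of SEQ ID 242 with the standard genetic code. The function name `translate` and the compact codon table are illustrative assumptions and are not part of the original disclosure.

```python
# Minimal sketch (illustrative; not part of the original disclosure):
# translate a DNA string codon by codon with the standard genetic code.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate complete codons only; whitespace is ignored, '*' is a stop."""
    seq = "".join(dna.split()).upper()
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq) - 2, 3))


# First three groups of the cp6629 nucleotide listing (SEQ ID 242).
first_30 = "ATGAGTAATA TAACCTCGCC AGTTATTCAA"
print(translate(first_30))
```

The lookup is literal, so an initial GTG or TTG codon is rendered as V or L, which is how the protein listings for cp6738 and cp6742 begin.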



Example 122 

The following C.pneumoniae protein (PID 4376732) was expressed <SEQ ID 243; cp6732>: 

1 MEMMSPFQQP EQCHFDVVGS FLRPESLTRA RSDFEEGRIV YEQMRVVEDA 

51 AIRNLIKKQT EAGLIFFTDG EFRRYSWDFD FMWGFHGVDR RRDSNDPEIG 

101 VYLKDKISVS KHPFIEHFEF VKTFEKGNAK AKQTIPSPSQ FFHEMIFAPN 

151 LKNTRKFYPT NQELIDDIVF YYRQVIQDLY AAGCRNLQLD DCAWCRLLDI 

201 RAPSWYGVDS HDRLQEILEQ FLWIHNLVMK DRPEDLFVSL HVCRGDYQAE 

251 FFSRRAYDSI EEPLFAKTDV DSYHYYWALD DKYSGGAEPL AYVSGEKHVC 

301 LGLISSNHSC IEDRDAVVSR IYEAASYIPL ERLSLSPQCG FASCEGDHRM 

351 TEEEQWKKIA FVKEIAKEIW G* 

The cp6732 nucleotide sequence <SEQ ID 244> is: 



1 ATGGAAATGA TGAGCCCATT CCAACAACCT GAGCAATGTC ATTTTGATGT 

51 TGTGGGAAGT TTCTTACGTC CTGAAAGTCT TACACGAGCA CGCTCTGATT 

101 TTGAAGAAGG AAGAATTGTC TATGAGCAGA TGCGAGTTGT CGAAGATGCT 

151 GCTATTCGTA ATCTCATAAA AAAGCAAACA GAAGCAGGTC TTATCTTTTT 

201 TACTGATGGG GAATTCCGTA GGTATAGTTG GGATTTCGAC TTTATGTGGG 

251 GATTCCATGG CGTGGATCGT CGCAGGGACT CTAATGACCC TGAAATTGGA 

301 GTGTATCTTA AAGATAAAAT CTCCGTATCA AAACATCCGT TTATAGAACA 

351 TTTCGAGTTT GTCAAAACTT TTGAGAAGGG AAATGCAAAA GCAAAACAAA 

401 CGATTCCTTC TCCATCACAA TTTTTCCATG AGATGATTTT TGCTCCTAAT 

451 CTGAAAAATA CTCGGAAGTT TTATCCTACG AATCAAGAGC TAATTGATGA 

501 TATTGTCTTT TATTATCGCC AAGTCATCCA AGATCTTTAT GCTGCAGGTT 

551 GTCGTAATTT GCAGTTGGAC GATTGTGCTT GGTGTCGCCT CTTGGATATA 

601 CGAGCGCCTT CTTGGTATGG TGTTGATTCT CATGACAGGT TGCAGGAAAT 

651 TTTAGAACAG TTTTTATGGA TCCATAATTT AGTGATGAAG GATAGACCCG 

701 AGGATCTTTT TGTAAGTCTG CATGTCTGTC GTGGTGATTA TCAGGCCGAG 

751 TTTTTCTCTA GACGAGCTTA TGATTCTATA GAGGAGCCTT TATTTGCTAA 

801 GACCGATGTG GATAGTTATC ACTATTATTG GGCTCTTGAT GATAAGTATT 

851 CAGGAGGTGC TGAGCCTTTA GCTTACGTCT CTGGAGAGAA ACACGTCTGC 

901 TTGGGATTGA TCTCCAGCAA CCATTCTTGT ATTGAAGATC GAGATGCTGT 

951 GGTTTCTCGT ATTTATGAAG CTGCGAGCTA CATTCCCTTA GAGAGACTTT 

1001 CTTTGAGCCC GCAATGTGGG TTTGCTTCTT GTGAGGGAGA CCATAGAATG 

1051 ACTGAAGAAG AACAGTGGAA GAAGATCGCC TTTGTGAAAG AGATTGCTAA 

1101 AGAGATCTGG GGATAA 



The PSORT algorithm predicts cytoplasm (0.2196). 

The protein was expressed in E.coli and purified as a GST-fusion product (Figure 122A). The 
recombinant protein was used to immunise mice, whose sera were used in a Western blot (Figure 
122B) and for FACS analysis. 

These experiments show that cp6732 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 



Example 123 



The following C.pneumoniae protein (PID 4376738) was expressed <SEQ ID 245; cp6738>: 

1 VWLRFLLLVS YDEKEKDVVV VCNHSEPNIL GLPPEAVSQL IEELSDEGYS 

51 YLNVVRCDLS GETTVQQRLL LNADEGRSMT VVISELPEGH PDIRNLQLAS 

101 ERIFVSREKE AADAYASGCK VVAFDDEHLP WVSSHIAYAE EIREKQEQTM 

151 QGSLTEEQLG ALLCNTVSTE KNLAFALDAV IKQSVWRFRN PDLFAYEREA 

201 LEASVTDALV SYVSNLDMIP YTSSQGIVIE DSSIVRTSQE HTLIVNCAAF 

251 DKLASQIEFL CPSDVLPISG KDPLISDDED EELNPKVSSA ADSKDKT* 

The cp6738 nucleotide sequence <SEQ ID 246> is: 

1 GTGTGGCTGC GCTTTTTACT TTTAGTGTCC TATGATGAGA AGGAGAAAGA 

51 CGTAGTTGTC GTTTGTAATC ATTCTGAACC TAATATCCTC GGCCTGCCTC 

101 CTGAAGCAGT CTCTCAGCTT ATTGAAGAGC TTAGCGATGA AGGCTATAGC 

151 TATCTGAATG TAGTGCGTTG TGATCTCTCC GGGGAGACTA CGGTTCAACA 

201 ACGTCTGCTA TTGAATGCCG ATGAAGGGAG ATCTATGACG GTGGTGATCT 

251 CAGAGCTTCC TGAAGGGCAC CCCGATATTC GGAATTTGCA GTTGGCATCC 

301 GAAAGAATTT TTGTTTCTCG TGAAAAAGAA GCTGCTGATG CCTATGCTTC 

351 AGGATGTAAA GTGGTCGCTT TCGATGATGA GCATCTCCCT TGGGTCTCCA 

401 GTCATATTGC CTACGCGGAG GAGATCAGAG AGAAACAAGA ACAAACAATG 

451 CAAGGGTCTT TAACTGAAGA GCAGTTAGGA GCACTCCTCT GCAACACAGT 

501 CTCCACAGAG AAAAATCTAG CCTTTGCTCT AGACGCCGTG ATAAAACAGT 

551 CTGTGTGGAG ATTCCGCAAT CCGGATCTTT TTGCTTATGA GAGAGAAGCT 

601 CTAGAGGCTT CAGTAACAGA TGCTTTAGTA TCTTACGTTT CAAATTTAGA 

651 CATGATACCG TACACAAGTT CTCAGGGCAT AGTCATAGAA GATAGTAGTA 

701 TCGTCCGTAC CTCTCAAGAG CATACACTCA TTGTGAACTG TGCAGCATTC 
751 GATAAGTTAG CGAGCCAAAT AGAGTTCTTA TGCCCCAGTG ACGTGTTGCC 
801 CATTTCTGGT AAAGACCCTT TGATTTCTGA TGATGAGGAT GAGGAACTGA 
851 ATCCTAAAGT TTCATCTGCT GCAGACTCTA AAGATAAAAC CTAG 

The PSORT algorithm predicts cytoplasm (0.1587). 

The protein was expressed in E.coli and purified as a GST-fusion product (Figure 123A). The 
recombinant protein was used to immunise mice, whose sera were used in a Western blot (Figure 
123B) and for FACS analysis. 

These experiments show that cp6738 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 
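
The numbered row layout used for the sequences (a start position followed by groups of ten residues) lends itself to a simple consistency check. The following Python sketch, using the last two rows of the cp6738 nucleotide listing, reads such rows and confirms that each row starts where the previous one ends. The helper names `parse_rows` and `is_contiguous` are hypothetical and are introduced only for illustration.

```python
import re


def parse_rows(block: str) -> list[tuple[int, str]]:
    """Return (start_position, residues) for each numbered row of a listing."""
    rows = []
    for line in block.splitlines():
        match = re.match(r"\s*(\d+)\s+(.*)", line)
        if match:
            # Remove every space so stray gaps inside a group are tolerated.
            rows.append((int(match.group(1)), "".join(match.group(2).split())))
    return rows


def is_contiguous(rows: list[tuple[int, str]]) -> bool:
    """True when each row starts immediately after the previous row ends."""
    return all(
        next_start == start + len(seq)
        for (start, seq), (next_start, _) in zip(rows, rows[1:])
    )


block = """\
801 CATTTCTGGT AAAGACCCTT TGATTTCTGA TGATGAGGAT GAGGAACTGA
851 ATCCTAAAGT TTCATCTGCT GCAGACTCTA AAGATAAAAC CTAG
"""
rows = parse_rows(block)
last_start, last_seq = rows[-1]
print(is_contiguous(rows), last_start + len(last_seq) - 1)
```

The final position reported for these rows is 894, i.e. 298 codons, which agrees with the 297-residue cp6738 protein listing followed by a stop codon.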

Example 124 

The following C.pneumoniae protein (PID 4376739) was expressed <SEQ ID 247; cp6739>: 

1 MTHCLHGWFS VVRHHFVQAF NFSRPLYSRI THFALGVIKA IPIVGHLVMG 

51 VDWLISHCFE RGVSHPGFPS DIAPILKVEK IAGRDHISRI ENQLKSLRKT 

101 IEVEDLDKVH GQYQENPYAD MASSEVLKLD KGVHVSELGK AFSRVRNRIT 

151 RSYSYAPTPQ LDSIAIVGID LVSPEEQENL VRLANEVIQL YPKSKTTLYL 

201 LIDFNKEWVG DISSDKEKQL RSLGLHSEVQ CLSVLEPQGA EGEDTKHFDL 

251 MVGCYGKDSY LREGKILQQA LGTSLGTVPW VNVMHTLPSR YRSRLSLPIN 

301 TEKDKTELYK EISRTHHQLH TLGMGLGAQD SGLLLDRQRL HAPLSQGSHC 

351 HSYLADLTHE ELKILLFSAF VDAKNISKKE LREVSLNFAN DTSVECGCAF 

401 YF* 

The cp6739 nucleotide sequence <SEQ ID 248> is: 

1 ATGACTCATT GCTTACATGG TTGGTTTTCT GTAGTTCGTC ATCACTTTGT 

51 GCAGGCGTTT AATTTCTCAC GTCCTTTATA TTCTCGAATT ACCCACTTCG 

101 CTTTAGGGGT GATTAAGGCC ATCCCCATTG TAGGGCATCT TGTTATGGGA 

151 GTCGATTGGT TGATCTCTCA TTGCTTCGAG AGGGGAGTCT CACACCCTGG 

201 GTTCCCTTCA GATATTGCTC CTATACTGAA AGTAGAAAAG ATCGCGGGCC 

251 GAGATCATAT TTCTAGAATC GAAAATCAGC TAAAGAGCCT TAGGAAAACT 

301 ATCGAGGTTG AAGATCTAGA TAAAGTCCAC GGGCAATATC AAGAGAATCC 

351 TTATGCAGAT ATGGCCTCTA GTGAGGTTCT TAAACTCGAT AAGGGAGTTC 

401 ATGTTAGCGA GCTTGGCAAA GCCTTTTCTA GAGTTCGCAA TCGCATCACC 

451 AGATCCTATA GTTATGCCCC TACTCCTCAG TTGGACTCTA TAGCTATTGT 

501 TGGTATAGAT CTCGTCAGTC CTGAAGAACA AGAGAATTTA GTACGCTTGG 

551 CGAATGAGGT CATTCAACTC TATCCCAAAT CAAAGACAAC TCTATATCTT 

601 CTTATCGATT TTAATAAGGA GTGGGTAGGG GATATCTCCT CTGATAAGGA 

651 AAAACAGCTC CGTTCTCTAG GTCTACATTC TGAAGTTCAG TGTCTTTCCG 

701 TCTTGGAACC TCAGGGTGCC GAGGGCGAAG ATACGAAACA CTTTGACCTT 

751 ATGGTCGGCT GTTATGGGAA GGATTCTTAC TTAAGGGAGG GTAAAATTTT 

801 ACAGCAGGCC CTAGGGACTT CGTTAGGTAC TGTTCCCTGG GTGAATGTTA 

851 TGCACACATT GCCATCTAGG TATAGATCTC GGCTTTCCTT ACCTATAAAT 

901 ACCGAAAAGG ATAAGACAGA GCTTTATAAA GAGATTTCTC GTACACACCA 

951 TCAGTTGCAT ACTTTGGGAA TGGGACTTGG AGCCCAGGAT TCAGGATTGC 

1001 TCTTAGACCG GCAACGACTC CATGCTCCTT TATCTCAAGG GTCTCACTGC 

1051 CATTCCTATC TTGCAGATCT CACCCATGAA GAGCTGAAAA TTTTGTTATT 

1101 TTCAGCATTT GTGGATGCTA AGAACATAAG TAAGAAAGAG CTTCGTGAGG 

1151 TATCTCTAAA TTTTGCTAAC GATACTTCCG TAGAGTGTGG CTGCGCTTTT 

1201 TACTTTTAG 

The PSORT algorithm predicts inner membrane (0.2190). 

The protein was expressed in E.coli and purified as a GST-fusion product (Figure 124A). The 
recombinant protein was used to immunise mice, whose sera were used in a Western blot (Figure 
124B) and for FACS analysis. 

These experiments show that cp6739 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 

Example 125 

The following C.pneumoniae protein (PID 4376741) was expressed <SEQ ID 249; cp6741>: 



1 MASCLSAWFS IVREHFYRAF DFSLPFCARI TEFVLGVIKG IPVVGHIIVG 

51 IEWLVSRYLE SFVTKPTFVS DVVSLLKTEK VAGRDHIARV VETLKRQRVA 

101 VAPEDEDKVH GKIPVHPFGG IQPVEVLTLY PEVQDATLGD AFSKIRNRVR 

151 QAYLQAPRPK LQKIYIIGND MNPFEVDDFL HLARLCNETQ RLYPDATISL 

201 YLTASGGRNA MDKKNRKLLS DCELNPKIAC LDFNQGDVVK QATCDCWMVY 

251 HGENDQGTLN QIQEELEKSG EETPWIHVGQ KPLSQSLWDF SPFSSLEMKG 

301 DKEKALEYSE LEKEQLYSRL VYVGERSSVL SLGFGDSRSG ILMDPKRVHA 

351 PLSEGHYCHS YLADLENPGL QKTILAAFLN PKELSSTILQ PISLNLILNS 

401 KTYLRQHFGF FERMSRSDRN VVVVVCDSWW GTDWKEEPSF QHFIMELECR 

451 GYSHFNIFAF RSNSMCVEER RILNESSQEK AFTMIFCEDS VSQGDIRCLH 

501 LASEGMLCGK ECYAVDVYTS GCANFMMEEV LTLERESNLW NRKHGLWKRE 

551 VRKQKQEAAL DQDESEIYVC NQLTAQQNFA CS* 

The cp6741 nucleotide sequence <SEQ ID 250> is: 

1 ATGGCTTCTT GTTTATCTGC CTGGTTTTCT ATAGTTCGTG AGCACTTTTA 

51 TCGAGCCTTT GATTTTTCTT TGCCGTTTTG TGCTCGTATT ACGGAATTTG 

101 TATTAGGGGT CATCAAGGGG ATCCCTGTTG TGGGTCACAT TATTGTTGGG 

151 ATAGAGTGGC TCGTTTCTAG GTATTTAGAG AGTTTCGTGA CCAAGCCGAC 

201 ATTTGTCTCT GATGTGGTGA GTCTTCTGAA AACAGAGAAA GTTGCTGGTC 

251 GCGATCACAT TGCTCGTGTA GTGGAGACTT TGAAGAGGCA GAGAGTCGCT 

301 GTGGCTCCTG AAGATGAGGA TAAGGTCCAT GGGAAGATTC CTGTGCATCC 

351 TTTCGGGGGA ATCCAACCTG TAGAAGTTCT CACTCTCTAT CCCGAAGTTC 

401 AAGATGCAAC GTTAGGGCTT GCCTTCTCTA AAATTCGTAA TCGTGTAAGA 

451 CAGGCGTATT TGCAAGCTCC ACGGCCAAAA CTGCAGAAGA TTTACATCAT 

501 AGGAAACGAT ATGAATCCTT TTGAAGTTGA CGACTTCTTG CATCTAGCCC 

551 GTCTCTGTAA TGAAACTCAA AGACTCTATC CTGACGCTAC GATTTCTCTA 

601 TATCTAACAG CTTCTGGTGG TCGCAATGCT ATGGACAAAA AGAATCGGAA 

651 GTTACTTAGT GATTGCGAAC TAAACCCCAA GATTGCTTGT TTGGACTTTA 

701 ATCAGGGTGA TGTAGTCAAA CAAGCAACTT GTGACTGTTG GATGGTGTAT 

751 CATGGGGAGA ATGATCAAGG TACGTTGAAT CAGATTCAGG AAGAGTTAGA 

801 AAAGTCAGGG GAGGAAACCC CTTGGATTCA TGTGGGGCAA AAGCCTCTTT 

851 CACAATCCTT GTGGGATTTC TCTCCATTTT CATCTTTGGA GATGAAGGGA 

901 GATAAAGAGA AAGCTCTAGA GTACTCTGAA TTAGAAAAAG AACAGCTATA 

951 TTCTCGATTG GTATACGTAG GAGAGCGCTC TTCGGTTCTT AGTTTGGGGT 

1001 TTGGAGATAG TCGGTCAGGG ATCTTGATGG ACCCAAAACG GGTGCATGCT 

1051 CCCTTATCTG AAGGGCATTA TTGTCATTCC TACCTTGCAG ACTTAGAAAA 

1101 TCCCGGGTTA CAAAAAACAA TTTTAGCGGC ATTTCTGAAT CCTAAGGAGT 

1151 TGAGCAGTAC CATACTGCAA CCTATATCTC TAAATCTTAT CTTAAATAGC 

1201 AAAACTTACT TAAGGCAGCA CTTTGGCTTT TTTGAGAGGA TGAGCAGAAG 

1251 TGATCGCAAT GTGGTTGTCG TTGTATGTGA TTCTTGGTGG GGTACCGACT 

1301 GGAAGGAGGA GCCAAGCTTC CAACACTTTA TTATGGAGCT AGAGTGTCGA 

1351 GGGTATTCGC ACTTCAATAT TTTTGCCTTT AGATCTAATA GCATGTGTGT 

1401 AGAAGAACGT AGGATCTTAA ATGAAAGTTC TCAAGAGAAA GCCTTTACCA 

1451 TGATTTTCTG TGAGGATTCA GTATCTCAAG GAGATATCCG CTGTTTGCAT 

1501 TTGGCGTCTG AAGGAATGCT TTGTGGTAAA GAGTGCTATG CTGTCGATGT 

1551 CTATACGTCA GGATGCGCGA ACTTTATGAT GGAAGAAGTC TTAACTTTGG 

1601 AGCGAGAATC TAATCTGTGG AATAGAAAGC ATGGTCTTTG GAAAAGAGAA 

1651 GTTAGAAAAC AGAAACAAGA AGCTGCTTTG GATCAAGACG AGAGCGAGAT 

1701 TTACGTTTGT AATCAGCTGA CGGCGCAACA GAACTTCGCT TGTTCTTGA 



The PSORT algorithm predicts inner membrane (0.2869). 

The protein was expressed in E.coli and purified as a GST-fusion product (Figure 125A). The 
recombinant protein was used to immunise mice, whose sera were used in a Western blot (Figure 
125B) and for FACS analysis. 

These experiments show that cp6741 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 

Example 126 

The following C.pneumoniae protein (PID 4376742) was expressed <SEQ ID 251; cp6742>: 



1 LFVSNFIFFV VMPIPYISSW ISTVRQHFVK AFDFSRPFCS RVTNFALGVI 

51 KAIPIVGHIV MGMEWLVSSC VAGIITRSSF TSDVVQIVKT EKALGRDHIS 

101 RVAEILQRER GTITPENQDK VHGKFPVCPF GRLKSEETLK LKPGEREGTL 

151 DTVFSPIRTR VTRAYLQAPR PEIRTISIVG SKLKTPQDFS QFVSLANETQ 

201 RLHPEALVCL YLTGLNRESQ MCDTTTAEKK QYLHNSGLDS RIQCKDSKED 

251 DAGSPENPEL WIGYYSREQQ HNIDGQYIQQ CLGKSADPIP WIHVTEDTKD 

301 FYYPPNFTSY SHTRQSTDPT SPPRLPESEG DKDSLYGQLS RSYHHEYMLG 

351 LGLKPEDAGL LMDPDRIYAP LSQGHYCHSY LADIENEDLR TLVLSPFLDP 

401 GNLSSEDLRP VAFNIARLPL ELDSLFFRLV AGQQEGRNIV TLAHGTPRPE 

451 DLDPDSMNIL TRRLQMSGYS YLNIFSYKSR KMIVKERQFF GDRSEGKSFT 

501 LILFEDPISA ADFRCLQLAA EGMVAKDLPS VADICASGCS CIQFSEMQSP 

551 QAIEYRQWEA RVEDEAGEEA REPVIYSQDQ LSSMLTTQQN FVFSLDAVVK 

601 QAIWRFRSKG LLTMERKALG EEFLTAIFSY LGSQERNENM GKRTTEEHEV 

651 VISFEELDRM VQVLPAEVPA DSGNDPTRPV PNPDSNPDSS QNEGS* 

The cp6742 nucleotide sequence <SEQ ID 252> is: 

1 TTGTTTGTTT CTAATTTTAT TTTTTTTGTT GTTATGCCAA TTCCCTATAT 

51 TTCTTCTTGG ATTTCTACCG TTCGACAGCA TTTTGTTAAG GCGTTTGATT 

101 TCTCTCGTCC CTTTTGTTCT AGGGTTACGA ATTTTGCTTT AGGGGTCATC 

151 AAGGCCATCC CTATTGTAGG ACATATTGTC ATGGGGATGG AGTGGTTAGT 

201 TTCTTCCTGT GTTGCCGGGA TTATTACTAG GTCCTCCTTT ACCTCAGATG 

251 TCGTTCAGAT TGTAAAGACT GAGAAGGCGT TAGGTCGAGA TCATATATCT 

301 CGAGTGGCGG AGATATTGCA AAGAGAAAGG GGGACCATAA CTCCTGAGAA 

351 TCAAGATAAG GTGCATGGGA AGTTTCCTGT CTGTCCTTTT GGTCGTTTAA 

401 AATCCGAGGA AACTTTAAAA CTTAAGCCGG GAGAAAGAGA GGGAACTTTA 

451 GATACTGTAT TTTCTCCGAT TCGCACGCGC GTGACTCGTG CGTACTTACA 

501 GGCCCCCCGA CCCGAAATAC GTACGATTTC TATTGTGGGT TCGAAACTTA 

551 AAACTCCTCA AGATTTCTCG CAATTTGTGA GTCTCGCGAA TGAAACGCAG 

601 AGACTGCATC CTGAAGCGTT AGTTTGTCTG TATTTGACAG GCTTGAATCG 

651 CGAATCTCAG ATGTGCGATA CAACTACTGC AGAGAAGAAG CAGTACCTAC 

701 ATAACTCAGG TCTCGACTCT AGAATCCAGT GCAAAGACAG TAAAGAAGAC 

751 GACGCTGGCT CTCCTGAAAA TCCCGAACTT TGGATTGGCT ATTATTCACG 

801 AGAGCAACAG CATAATATAG ACGGGCAGTA TATTCAGCAG TGTCTAGGGA 

851 AGAGTGCAGA TCCAATTCCT TGGATTCATG TTACTGAAGA CACAAAGGAT 

901 TTTTATTACC CACCAAACTT TACTTCATAC TCACATACAA GACAATCTAC 

951 AGACCCAACA TCGCCACCAA GACTCCCTGA AAGTGAGGGG GATAAGGATT 

1001 CCTTGTACGG ACAACTGAGT CGATCGTATC ACCATGAGTA TATGCTTGGT 

1051 TTGGGATTAA AACCAGAGGA TGCAGGACTC CTGATGGACC CGGATAGAAT 

1101 CTATGCTCCT CTATCCCAAG GGCATTATTG TCATTCCTAC CTTGCGGATA 

1151 TAGAAAATGA GGATCTACGA ACTTTAGTCC TTTCGCCTTT CCTAGATCCT 

1201 GGCAATCTTA GTAGCGAGGA TCTTCGTCCT GTAGCATTCA ATATCGCTAG 

1251 ATTGCCATTA GAATTGGACT CGTTATTTTT CCGCCTTGTT GCGGGTCAGC 

1301 AAGAAGGGAG AAACATAGTT ACCCTTGCCC ACGGAACTCC TCGTCCAGAA 

1351 GATCTTGATC CTGACTCAAT GAACATTCTG ACCAGAAGAT TACAAATGTC 

1401 TGGATATAGC TATTTGAACA TTTTCTCCTA TAAATCACGG AAAATGATTG 

1451 TAAAAGAACG TCAGTTCTTT GGAGATCGTT CTGAAGGGAA GTCTTTCACA 

1501 TTGATCTTAT TTGAGGATCC CATTAGTGCA GCAGATTTCC GTTGTTTGCA 

1551 GCTAGCTGCA GAAGGTATGG TTGCTAAGGA TCTCCCCAGC GTAGCAGATA 

1601 TTTGTGCCTC TGGATGTTCC TGCATTCAGT TTTCTGAGAT GCAGAGTCCT 

1651 CAGGCTATTG AATATAGACA ATGGGAGGCA CGTGTCGAAG ATGAAGCAGG 

1701 AGAAGAAGCC AGAGAACCAG TAATTTATTC TCAGGATCAA TTGAGCAGCA 

1751 TGCTCACTAC ACAACAGAAT TTTGTATTTT CTCTAGATGC TGTGGTAAAA 

1801 CAGGCGATCT GGAGATTCCG TTCGAAAGGT CTTCTTACTA TGGAAAGAAA 

1851 GGCACTAGGC GAGGAGTTCT TAACTGCGAT ATTTTCCTAT TTAGGGAGTC 

1901 AGGAGCGTAA TGAGAATATG GGGAAAAGAA CTACCGAAGA ACATGAGGTC 

1951 GTTATCAGCT TCGAAGAGCT AGATCGCATG GTGCAAGTCC TCCCAGCCGA 

2001 AGTCCCTGCA GATTCAGGCA ATGATCCTAC GCGTCCCGTT CCTAATCCAG 

2051 ATAGTAACCC TGATTCCTCG CAAAATGAAG GCAGTTAG 

The PSORT algorithm predicts inner membrane (0.2338). 

The protein was expressed in E.coli and purified as a GST-fusion product (Figure 126A). The 
recombinant protein was used to immunise mice, whose sera were used in a Western blot (Figure 
126B) and for FACS analysis. 

These experiments show that cp6742 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 

Example 127 

The following C.pneumoniae protein (PID 4376744) was expressed <SEQ ID 253; cp6744>: 

1 VIQHLLNFAL EETPSISVQY QEQEKLSPCD HSPEIGKKKR WNKLESFSTY 

51 CSLFMSVKDH YKLNLGIQNS LSGWLLDPYR VCAPLSSPYS CPSYLLDLQN 

101 KELRRSLLST FLDPKNLTSE TFRSVSINFG NSSFGQRWSE FLSRVLHDEK 

151 EKHVAVVCND AKLLEEGLSP EALSLLEEDL RESGYSYLNI LSVSPEGVSK 

201 VQERQILRRD LQGRSFTVMI TDLPLGSEDI RSLQLASDRI LVSSSLDAAD 

251 ACASGCKVLV YENPNASWAQ ELENFYKQVE RRR* 

The cp6744 nucleotide sequence <SEQ ID 254> is: 

1 GTGATACAAC ATCTTCTAAA CTTTGCTCTA GAAGAGACCC CTTCCATTTC 
51 CGTGCAATAC CAAGAACAAG AGAAGCTCTC TCCGTGCGAT CATTCCCCAG 
101 AAATAGGTAA AAAGAAAAGA TGGAATAAGC TGGAATCCTT CTCCACGTAT 
151 TGTTCTCTGT TTATGTCTGT TAAGGATCAT TATAAGCTGA ATCTAGGAAT 
201 TCAGAATTCC CTGTCAGGGT GGCTTCTGGA TCCCTATAGG GTTTGCGCGC 
251 CTTTATCTTC ACCGTACTCG TGTCCTTCCT ATCTTTTAGA TTTGCAAAAC 
301 AAAGAGCTAC GTCGTTCCCT TCTGTCAACG TTTCTAGACC CTAAAAATCT 
351 CACTAGCGAA ACATTCCGTT CTGTCTCTAT AAACTTTGGC AACTCTTCGT 
401 TTGGACAGAG ATGGTCAGAG TTTCTATCTC GTGTTCTGCA CGACGAGAAA 
451 GAAAAGCACG TAGCTGTTGT TTGTAATGAT GCAAAACTTC TGGAAGAAGG 
501 ATTCTCCCCA GAGGCATTGT CTCTATTAGA AGAAGACTTA AGAGAATCAG 
551 GGTATTCGTA TCTAAACATT CTCTCGGTGA GCCCCGAAGG AGTCTCCAAG 
601 GTTCAGGAAC GTCAGATTCT AAGGCGAGAT CTCCAAGGAC GGTCCTTTAC 
651 TGTCATGATT ACAGATCTTC CTTTAGGTAG CGAAGATATC CGTAGTTTAC 
701 AATTAGCCTC GGATAGGATT TTAGTCTCCA GTTCTCTTGA TGCCGCGGAT 
751 GCATGTGCTT CGGGATGTAA AGTCTTAGTC TACGAAAATC CAAATGCATC 
801 CTGGGCTCAG GAATTGGAGA ACTTCTACAA ACAAGTTGAG AGAAGAAGGT 
851 AG 

The PSORT algorithm predicts cytoplasm (0.3833). 

The protein was expressed in E.coli and purified as a GST-fusion product (Figure 127A). The 
recombinant protein was used to immunise mice, whose sera were used in a Western blot (Figure 
127B) and for FACS analysis. 

These experiments show that cp6744 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 
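
Each example reports its PSORT result in the same sentence form, so the predicted localisation and the value in parentheses can be collected mechanically. The following Python sketch shows one way to do this. The pattern and the function name `parse_psort` are assumptions made for illustration, and the sketch does not call PSORT itself.

```python
import re

# Matches sentences of the form used in the examples, e.g.
# "The PSORT algorithm predicts cytoplasm (0.3833)."
PSORT_LINE = re.compile(r"The PSORT algorithm predicts (.+?) \((\d*\.\d+)\)\.")


def parse_psort(sentence: str) -> tuple[str, float]:
    """Return (predicted localisation, reported value) from one sentence."""
    match = PSORT_LINE.search(sentence)
    if match is None:
        raise ValueError("no PSORT prediction found")
    return match.group(1), float(match.group(2))


reported = {
    "cp6742": "The PSORT algorithm predicts inner membrane (0.2338).",
    "cp6744": "The PSORT algorithm predicts cytoplasm (0.3833).",
}
for name, sentence in reported.items():
    location, value = parse_psort(sentence)
    print(name, location, value)
```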



Example 128 



The following C.pneumoniae protein (PID 4376745) was expressed <SEQ ID 255; cp6745>: 

1 VACPSISSWF TVVRQHFVNA FDFTHPVCSR ITNFALGIIK AIPVLGHIVM 

51 GIEWLISWIP RHTVRHGMFT SDVSSAIKVE QTRGHNCLAP LEAYLSSLRV 

101 PISQEDLGKV HGRTPEDPFV DITPTEIVQL LPDEELSTVD EALQGVRSRL 

151 TYAYRSVEKP MIQDLALVGF GLRDSADLIN FVRLANGVQN HYPHTKVKLY 

201 LAKNLADVWD CEISEEEKGQ LRALGLDPKI ESISLTSAGL PSVPEVATVD 

251 FMITCYGKDQ EVQDP* 




The cp6745 nucleotide sequence <SEQ ID 256> is: 

1 GTGGCTTGTC CAAGTATTTC TTCTTGGTTT ACTGTCGTTC GACAGCATTT 

51 TGTAAACGCC TTTGATTTCA CCCATCCCGT TTGTTCTCGG ATTACAAATT 

101 TTGCTTTGGG GATCATTAAG GCAATTCCCG TATTAGGACA CATTGTCATG 

151 GGAATCGAGT GGTTGATTTC CTGGATTCCC AGACACACCG TTCGTCATGG 

201 AATGTTTACT TCTGATGTCT CTAGTGCTAT TAAAGTAGAA CAAACACGGG 

251 GTCATAATTG TTTAGCTCCC CTAGAAGCCT ATTTAAGTAG CTTGAGAGTC 

301 CCCATTTCCC AAGAAGATCT AGGCAAAGTA CACGGGAGAA CCCCAGAAGA 

351 TCCCTTCGTA GATATCACAC CCACAGAAAT TGTCCAACTT CTCCCTGATG 

401 AAGAACTCTC TACTGTAGAT GAGGCACTGC AAGGCGTTCG TAGTAGGTTA 

451 ACCTATGCCT ATAGGTCCGT AGAGAAACCT ATGATTCAAG ATCTTGCTCT 

501 TGTGGGTTTT GGTCTCCGAG ATTCTGCGGA CCTCATAAAT TTCGTGCGTC 

551 TTGCTAATGG CGTGCAGAAT CACTATCCCC ATACTAAAGT GAAGCTCTAT 

601 TTAGCGAAGA ACTTGGCAGA TGTCTGGGAC TGTGAAATTT CTGAAGAGGA 

651 AAAAGGGCAA CTCCGAGCTC TAGGTTTAGA CCCTAAAATA GAGAGTATAT 

701 CCCTTACGAG TGCAGGTCTT CCTTCAGTGC CAGAAGTCGC TACTGTCGAT 

751 TTTATGATTA CCTGTTACGG GAAAGATCAG GAAGTCCAAG ATCCCTAG 

The PSORT algorithm predicts inner membrane (0.2253). 

The protein was expressed in E.coli and purified as a GST-fusion product (Figure 128A). The 
recombinant protein was used to immunise mice, whose sera were used in a Western blot (Figure 
128B) and for FACS analysis. 

These experiments show that cp6745 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 
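
The sequences in these examples are set out in numbered rows of five groups of ten residues. As a hedged illustration of that layout, the Python sketch below re-creates the final row of the cp6745 nucleotide listing from a continuous string. The function `format_listing` and its parameters are hypothetical names chosen for this sketch.

```python
def format_listing(seq: str, first: int = 1, group: int = 10, per_row: int = 5) -> str:
    """Lay a sequence out in numbered rows of space-separated groups."""
    width = group * per_row
    lines = []
    for start in range(0, len(seq), width):
        row = seq[start:start + width]
        groups = [row[i:i + group] for i in range(0, len(row), group)]
        # Each row is prefixed with the position of its first residue.
        lines.append(f"{first + start} " + " ".join(groups))
    return "\n".join(lines)


# Last 48 nucleotides of SEQ ID 256 (positions 751 onwards).
tail = "TTTATGATTACCTGTTACGGGAAAGATCAGGAAGTCCAAGATCCCTAG"
print(format_listing(tail, first=751))
```

Applied to a complete sequence with the default `first=1`, the same function yields rows numbered 1, 51, 101 and so on, as in the listings above.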

Example 129 

The following C.pneumoniae protein (PID 4376747) was expressed <SEQ ID 257; cp6747>: 

1 MMKQGVGQDA KELYTFLSRG NEHYQPCLWF SLEEELGFLF DEKMLCAPLS 

51 EDHYCHSYLV DLVDQHLKDL ILSMFLDPQN ISAGELLKVS INVGDSFSPL 

101 QQKDFLSMVL RDETGKNVVV VFKGVLSLPA TQVCKLVEEL NSKDYSYLNI 

151 FSCHGDSSPQ LLFRKELEGT SGRYFTVICA LYLGDTDMRS LQLASERIMV 

201 SREFDLVDAY AARCKLLKID HTNWRPGTFS RHADFADAVD VSAGFNSREF 

251 KLITQANQGI LESGELPLPS KTFWEGFLAF CDRVTVTRHF IPMLDAAIKQ 

301 AVWTHKHPSL IDKECEALDL KTQCLPSIVS YLEYVTNSHE KTSKGPFIQK 

351 EIIADCSPLK EALFPGSDED VPSTSEDPSD DHPSDLEDS* 

The cp6747 nucleotide sequence <SEQ ID 258> is: 



1 ATGATGAAAC 
51 ATCTCGTGGG 
101 AGGAACTCGG 
151 GAGGATCACT 
201 AAAGGATTTA 
251 GAGAACTCCT 
301 CAACAGAAAG 
351 CGTCGTCGTG 
401 GCAAATTAGT 
451 TTTTCTTGTC 
501 AGAGGGAACT 
551 GGGATACAGA 
601 TCTAGAGAGT 
651 GAAAATCGAT 
701 ATTTCGCAGA 
751 AAACTGATTA 
801 GCTCCCTTCA 
851 TGACTGTCAC 
901 GCGGTATGGA 
951 CCTAGACTTG 
1001 ATGTCACAAA 
1051 GAGATTATCG 
1101 TGATGAAGAT GTTCCCTCTA CCTCTGAGGA TCCTTCAGAT GATCATCCTT 
1151 CGGATCTTGA AGACTCTTAA 

The PSORT algorithm predicts inner membrane (0.1447). 

The protein was expressed in E.coli and purified as a GST-fusion product (Figure 129A) and also as 
a his-tagged product. The recombinant proteins were used to immunise mice, whose sera were used 
in a Western blot (Figure 129B) and for FACS analysis. 

These experiments show that cp6747 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 

Example 130 

The following C.pneumoniae protein (PID 4376756) was expressed <SEQ ID 259; cp6756>: 

1 MASGIGGSSG LGKIPPKDNG DRSRSPSPKG ELGSHEISLP PQEHGEEGAS 

51 GSSHIHSSSS FLPEDQESQS SSSAASSPGF FSRVRSGVDR ALKSFGNFFS 

101 AESTSQARET RQAFVRLSKT ITADERRDVD SSSAAATEAR VAEDASVSGE 

151 NPSQGVPETS SGPEPQRLFS LPSVKKQSGL GRLVQTVRDR IVLPSGAPPT 

201 DSEPLSLYEL NLRLSSLRQE LSDIQSNDQL TPEEKAEATV TIQQLIQITE 

251 FQCGYMEATQ SSVSLAEARF KGVETSDEIN SLCSELTDPE LQELMSDGDS 

301 LQNLLDETAD DLEAALSHTR LSFSLDDNPT PIDNNPTLIS QEEPIYEEIG 

351 GAADPQRTRE NWSTRLWNQI REALVSLLGM ILSILGSILH RLRIARHAAA 

401 EAVGRCCTCR GEECTSSEED SMSVGSPSEI DETERTGSPH DVPRRNGSPR 

451 EDSPLMNALV GWAHKHGAKT KESSESSTPE ISISAPIVRG WSQDSSVSFI 

501 VMEDDHIFYD VPRRKDGIYD VPSSPRWSPA RELEEDVFGD YEVPITSAEP 

551 SKDKNIYMTP RLATPAIYDL PSRPGSSGSS RSPSSDRVRS SSPNRRGVPL 

601 PPVPSPAMSE EGSIYEDMSG ASGAGESDYE DMSRSPSPRG DLDEPIYANT 

651 PEDNPFTQRN IDRILQERSG GASASPVEPI YDEIPWIHGR PPATLPRPEN 

701 TLTNVSLRVS PGFGPEVRAA LLSESVSAVM VEAESIVPPT EPGDGESEYL 

751 EPLGGLVATT KILLQKGWPR GESNA* 

The cp6756 nucleotide sequence <SEQ ID 260> is: 

1 ATGGCATCAG GAATCGGAGG ATCTAGTGGA TTAGGAAAGA TTCCACCTAA 

51 AGATAATGGG GATAGAAGTC GATCGCCCTC TCCTAAGGGA GAACTTGGCA 

101 GCCACGAGAT TTCCCTGCCT CCTCAAGAAC ATGGAGAGGA AGGAGCTTCA 

151 GGATCTTCGC ATATACATAG CAGTTCCTCT TTTCTACCAG AAGATCAGGA 

201 GTCTCAGAGC TCTTCTTCGG CAGCTTCTAG CCCGGGATTT TTTTCTCGCG 

251 TACGTTCTGG GGTAGACAGG GCCTTAAAAT CATTTGGCAA CTTTTTTTCC 

301 GCAGAGTCTA CGAGTCAAGC GCGTGAAACG CGACAAGCTT TTGTTAGATT 

351 ATCAAAAACC ATCACCGCGG ATGAGAGACG GGATGTCGAT TCATCAAGTG 

401 CTGCTGCTAC AGAAGCCCGA GTGGCAGAGG ACGCGAGTGT TTCAGGCGAA 

451 AATCCTTCTC AGGGGGTTCC AGAAACCTCT TCTGGACCAG AACCTCAGCG 

501 TTTATTTTCT CTTCCTTCAG TAAAAAAACA GAGCGGTTTG GGTCGGTTGG 

551 TACAGACAGT TCGCGATCGC ATAGTACTTC CTAGTGGGGC TCCACCTACA 

601 GACAGCGAGC CTTTAAGTCT CTACGAGCTA AACCTCCGTT TGAGTAGTTT 

651 ACGTCAGGAG CTCTCTGACA TACAAAGTAA TGATCAGTTG ACTCCAGAGG 

701 AAAAAGCAGA AGCCACAGTT ACCATACAAC AGCTGATCCA AATTACAGAA 

751 TTCCAATGCG GCTATATGGA GGCAACACAA TCTTCGGTAT CTCTAGCAGA 

801 AGCTCGTTTT AAGGGGGTAG AAACTAGTGA TGAGATCAAT TCCCTCTGTT 

851 CAGAACTGAC AGATCCTGAG CTTCAAGAAC TCATGAGTGA TGGAGACTCT 

901 CTTCAAAACC TATTAGATGA GACTGCCGAC GATTTAGAAG CTGCTTTGTC 

951 CCATACTCGA TTGAGTTTTT CTTTAGACGA TAATCCAACT CCGATAGACA 

1001 ATAATCCAAC TCTGATTTCT CAAGAAGAGC CTATTTATGA GGAAATCGGA 

1051 GGAGCTGCAG ATCCTCAAAG AACTCGGGAA AACTGGTCTA CAAGATTATG 

1101 GAATCAGATT CGCGAGGCTC TGGTTTCTCT TTTAGGAATG ATTTTAAGCA 

1151 TTCTAGGGTC CATCTTGCAC AGGTTGCGTA TTGCTCGTCA TGCAGCTGCT 

1201 GAAGCAGTGG GTCGTTGTTG CACGTGCCGA GGAGAAGAGT GTACTTCTTC 

1251 TGAAGAGGAC TCGATGTCGG TGGGGTCTCC TTCAGAAATT GATGAAACTG 

1301 AAAGAACGGG CTCTCCGCAT GACGTTCCAC GCAGAAATGG AAGTCCACGT 

1351 GAAGATTCTC CATTGATGAA TGCCTTAGTA GGATGGGCAC ATAAGCACGG 

1401 TGCTAAAACC AAGGAGAGTT CAGAATCAAG TACCCCGGAA ATTTCGATTT 

1451 CTGCTCCCAT AGTGAGAGGT TGGAGTCAAG ACAGTTCCGT CAGTTTTATT 




1501 GTTATGGAAG ATGATCATAT TTTCTATGAT GTTCCTCGTA GAAAAGATGG 

1551 AATCTATGAC GTTCCTAGTT CCCCTAGATG GAGTCCTGCG CGAGAGTTGG 

1601 AAGAGGATGT TTTTGGAGAT TATGAAGTTC CTATAACCTC TGCTGAACCA 

1651 TCTAAAGACA AGAACATCTA CATGACACCT AGATTAGCAA CTCCTGCTAT 

1701 CTATGATCTT CCTTCACGTC CAGGATCGTC TGGAAGCTCA CGTTCTCCGT 

1751 CTTCAGATCG CGTACGAAGC AGCTCACCAA ATAGACGGGG TGTGCCTCTT 

1801 CCTCCAGTTC CTTCACCTGC TATGAGTGAG GAGGGGAGCA TTTATGAGGA 

1851 TATGAGCGGT GCTTCAGGTG CAGGTGAAAG TGATTATGAA GATATGAGCC 

1901 GTTCCCCCTC TCCTAGAGGC GACTTGGATG AACCCATATA TGCTAATACT 

1951 CCTGAAGATA ATCCATTTAC TCAGAGAAAT ATAGATAGAA TTTTACAGGA 

2001 GAGGTCAGGC GGTGCTTCCG CTTCTCCTGT AGAGCCTATT TATGATGAGA 

2051 TCCCATGGAT TCATGGCAGG CCCCCTGCTA CACTTCCAAG ACCCGAGAAT 

2101 ACATTGACTA ATGTTTCGCT TAGAGTGAGC CCAGGGTTTG GACCAGAAGT 

2151 AAGAGCCGCT TTGCTTAGCG AGAGCGTGAG TGCTGTTATG GTCGAAGCAG 

2201 AGAGTATTGT TCCTCCAACA GAGCCGGGGG ACGGAGAATC AGAATATCTA 

2251 GAGCCCTTAG GGGGACTTGT AGCTACAACG AAAATCTTAC TACAAAAAGG 

2301 ATGGCCTCGT GGAGAGTCGA ATGCTTAG 

The PSORT algorithm predicts inner membrane (0.3994). 

The protein was expressed in E.coli and purified as a GST-fusion product (Figure 130A). The 
recombinant protein was used to immunise mice, whose sera were used in a Western blot (Figure 
130B) and for FACS analysis. 

These experiments show that cp6756 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 

Example 131 

The following C.pneumoniae protein (PID 4376761) was expressed <SEQ ID 261; cp6761>: 

1 MTVAEVKGTF KLVCLGCRVN QYEVQAYRDQ LTILGYQEVL DSEIPADLCI 

51 INTCAVTASA ESSGRHAVRQ LCRQNPTAHI VVTGCLGESD KEFFASLDRQ 

101 CTLVSNKEKS RLIEKIFSYD TTFPEFKIHS FEGKSRAFIK VQDGCNSFCS 

151 YCIIPYLRGR SVSRPAEKIL AEIAGVVDQG YREVVIAGIN VGDYCDGERS 

201 LASLIEQVDR IPGIERIRIS SIDPDDITED LHRAITSSRH TCPSSHLVLQ 

251 SGSNSILKRM NRKYSRGDFL DCVEKFRASD PRYAFTTDVI VGFPGESDQD 

301 FEDTLRIIED VGFIKVHSFP FSARRRTKAY TFDNQIPNQV IYERKKYLAE 

351 VAKRVGQKEM MKRLGETTEV LVEKVTGQVA TGHSPYFEKV SFPVVGTVAI 

401 NTLVSVRLDR VEEEGLIGEI V* 

The cp6761 nucleotide sequence <SEQ ID 262> is: 

1 ATGACGGTTG CGGAAGTCAA AGGAACATTT AAGCTGGTCT GTTTAGGCTG 

51 TCGGGTGAAT CAGTATGAGG TCCAAGCATA TCGCGACCAG TTGACTATCT 

101 TAGGTTACCA AGAGGTCCTG GATTCTGAAA TCCCTGCAGA TTTATGCATA 

151 ATCAATACGT GTGCTGTCAC AGCTTCTGCT GAGAGTTCGG GTCGTCATGC 

201 TGTGCGTCAG TTATGTCGTC AGAACCCTAC AGCACATATT GTTGTCACAG 

251 GTTGTTTGGG GGAATCTGAC AAAGAGTTTT TTGCTTCTTT GGATCGGCAA 

301 TGCACACTTG TTTCCAATAA AGAAAAATCC CGACTTATAG AAAAAATTTT 

351 TTCCTATGAT ACGACCTTCC CTGAGTTCAA GATCCATAGT TTTGAGGGAA 

401 AGTCTCGAGC TTTTATTAAA GTTCAAGATG GCTGTAATTC TTTTTGCTCG 

451 TAGTGCATTA TTCCTTATTT GCGGGGGCGT TCGGTTTCTC GTCCTGCTGA 

501 GAAGATTTTA GCTGAAATCG CAGGGGTTGT AGACCAAGGA TATCGCGAAG 

551 TTGTAATTGC AGGAATTAAT GTTGGAGATT ATTGCGATGG AGAGCGTTCA 

601 TTAGCCTCTT TGATTGAACA GGTGGACCGG ATTCCTGGAA TTGAGAGGAT 

651 TCGAATTTCC TCTATAGATC CTGATGATAT CACTGAAGAT CTGCACCGTG 

701 CCATCACCTC ATCGCGTCAC ACTTGTCCTT CGTCACACCT TGTTCTTCAA 

751 TCGGGGTCGA ATTCAATTTT AAAGAGAATG AACCGGAAGT ATTCTCGCGG 

801 AGATTTTTTA GATTGTGTAG AGAAGTTCCG TGCTTCTGAT CCTCGCTATG 

851 CCTTTACTAC AGATGTGATT GTCGGATTTC CTGGAGAGAG TGATCAAGAT 

901 TTTGAAGATA CTTTGAGAAT TATTGAAGAT GTAGGCTTTA TTAAAGTGCA 

951 TAGTTTCCCT TTCAGTGCTC GTCGTCGTAC TAAGGCATAT ACTTTTGATA 

1001 ATCAGATTCC CAATCAGGTG ATCTATGAGA GGAAGAAGTA TCTTGCTGAG 

1051 GTTGCTAAGA GGGTAGGCCA GAAAGAGATG ATGAAGCGTT TAGGAGAGAC 
1101 TACAGAGGTG CTTGTTGAGA AAGTAACGGG GCAGGTTGCT ACGGGTCACT 

1151 CTCCTTATTT TGAAAAGGTT TCTTTCCCTG TTGTAGGAAC GGTAGCTATC 

1201 AACACTCTAG TTTCTGTGCG TCTTGATAGG GTAGAGGAAG AAGGGCTGAT 

1251 TGGGGAGATT GTATGA 

The PSORT algorithm predicts inner membrane (0.1574). 

The protein was expressed in E.coli and purified as a GST-fusion product (Figure 131A) and also as 
a his-tagged product. The recombinant proteins were used to immunise mice, whose sera were used 
in a Western blot (Figure 131B) and for FACS analysis. 

These experiments show that cp6761 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 

Example 132 

The following C.pneumoniae protein (PID 4376766) was expressed <SEQ ID 263; cp6766>: 



1 MATSVPVTSS TSVGEANSSN ERFTERTSRM YYAALVLGAL SCLIFIAMIV

51 IFPQVGLWAV VLGFALGCLL LSLAIVFAVS GLVLGKTLEP SREATPPEIV

101 AQKEWTTQQD VLGNEYWRSE LISLFLRGDL HESLIVDSKD RSLDIDQSLQ

151 NILKLEPLST TLSLLKKDCV HINIILHLVR QWNLLGVDLS PEVTAHAEEL

201 LLFLIEEQYY SPDILKLIRY GDALQATSPL MDWADSGSFS VDADGVFSCR

251 REECSPEDAL AQFDLLLALE NPDRRFLKDS FLTYIWSSSF FEKFLHRHLE

301 SLQRKLPETA IDVARYEAQI QTFLSRYFQK LDLINAMSLD WGYNCAEGEK

351 CYESANQRLD NLFIAFSSSV PAMKRLFDKY GSVVRVDRRQ IREQILSNTE

401 ILENESGFLC SLYEYPLSYL IDWAVLLDCV RGTEISLEDQ ADYTVCLQGL

451 DSMLSQFASR LQSGQKVLNP RDVLSEQAAV MLVHGLAAQG VSFQGLKALM

501 YLTAVPQKMW LGALPLFESF PVFNRMKEFL GESLGD*

The cp6766 nucleotide sequence <SEQ ID 264> is:

1 ATGGCAACCT CTGTTCCTGT AACTTCATCT ACTTCTGTAG GAGAGGCTAA

51 CTCCTCCAAC GAAAGATTTA CTGAACGAAC ATCGCGAATG TATTACGCAG

101 CTTTAGTCCT AGGGGCTTTG AGCTGTTTAA TTTTTATTGC TATGATTGTC

151 ATTTTCCCAC AGGTCGGATT GTGGGCTGTG GTCCTCGGGT TTGCTCTTGG

201 ATGTTTACTT TTAAGCTTAG CTATCGTTTT TGCTGTCTCC GGTCTCGTTT

251 TAGGCAAGAC TTTAGAACCT AGTCGAGAAG CGACTCCTCC AGAAATTGTT

301 GCGCAAAAGG AGTGGACTAC ACAACAAGAT GTCTTAGGGA ATGAGTATTG

351 GCGTTCCGAG TTGATTTCCT TGTTCTTACG AGGGGATCTC CACGAATCTC

401 TGATTGTTGA TTCTAAGGAT CGATCTTTAG ATATTGATCA GAGTTTACAA

451 AATATATTGA AACTTGAGCC CCTATCTACG ACACTTTCGC TGTTAAAGAA

501 AGATTGTGTC CACATCAATA TCATTTTACA TTTAGTGAGA CAGTGGAACT

551 TACTGGGAGT GGATCTTAGT CCTGAAGTCA CTGCGCACGC CGAGGAACTT

601 CTACTCTTTT TGATAGAAGA GCAGTATTAC TCTCCTGATA TTTTGAAATT

651 GATTCGCTAC GGAGATGCTT TACAAGCAAC GTCTCCTTTG ATGGATTGGG

701 CAGATTCAGG TTCCTTTAGT GTAGACGCAG ACGGGGTATT TAGCTGTCGC

751 AGAGAAGAAT GTTCTCCTGA GGATGCTTTG GCGCAATTCG ATCTTCTTTT

801 GGCGTTGGAA AATCCCGACA GACGCTTCTT AAAGGATTCT TTTCTTACCT

851 ACATTTGGTC GTCTTCATTT TTTGAGAAGT TTTTACATCG CCATCTAGAG

901 AGCTTGCAAA GAAAGCTCCC AGAGACAGCG ATCGATGTCG CCCGCTATGA

951 AGCACAAATA CAAACATTTC TCTCTCGCTA TTTTCAGAAG CTCGATTTGA

1001 TAAACGCAAT GTCCTTAGAT TGGGGATATA ACTGTGCTGA GGGAGAAAAA

1051 TGTTATGAGA GCGCAAATCA AAGATTAGAC AACCTATTTA TTGCTTTTTC

1101 TTCTTCTGTT CCTGCTATGA AGCGGCTCTT TGACAAATAT GGTTCTGTGG

1151 TACGGGTAGA TCGTAGGCAG ATTCGTGAGC AGATTCTTTC GAACACTGAA

1201 ATCTTAGAAA ATGAGTCAGG GTTCCTCTGC AGTTTGTATG AATATCCTTT

1251 ATCCTATTTG ATAGATTGGG CTGTTTTGCT AGACTGTGTT CGCGGTACCG

1301 AAATCTCTCT AGAAGATCAG GCCGATTACA [missing] GCAAGGCTTG

1351 GATTCTATGT TATCTCAATT TGCGAGTCGT TTACAGTCTG GACAAAAAGT

1401 ATTGAATCCT AGAGATGTTT TAAGTGAACA GGCTGCGGTT ATGCTTGTTC

1451 ATGGCTTGGC AGCACAGGGC GTGTCGTTTC AAGGATTGAA AGCTTTGATG

1501 TATTTGACAG CCGTTCCCCA AAGAATGTGG TTAGGAGCAT TGCCTTTATT

1551 TGAATCTTTT CCTGTCTTTA ATCGGATGAA AGAATTTCTT GGGGAATCTC

1601 TGGGAGACTA G

The PSORT algorithm predicts inner membrane (0.6158).

The protein was expressed in E.coli and purified as a GST-fusion product (Figure 132A) and also as
a his-tagged product. The recombinant proteins were used to immunise mice, whose sera were used 
in a Western blot (Figure 132B) and for FACS analysis. 

These experiments show that cp6766 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 

Example 133 

The following C.pneumoniae protein (PID 4376804) was expressed <SEQ ID 265; cp6804>: 

1 MSNQLQPCIS LGCVSYINSF PLSLQLIKRN DIRCVLAPPA DLLNLLIEGK

51 LDVALTSSLG AISHNLGYVP GFGIAANQRI LSVNLYAAPT FFNSPQPRIA

101 ATLESRSSIG LLKVLCRHLW RIPTPHILRF ITTKVLRQTP ENYDGLLLIG

151 DAALQHPVLP GFVTYDLASG WYDLTKLPFV FALLLHSTSW KEHPLPNLAM

201 EEALQQFESS PEEVLKEAHQ HTGLPPSLLQ EYYALCQYRL GEEHYESFEK

251 FREYYGTLYQ QARL*

The cp6804 nucleotide sequence <SEQ ID 266> is: 

1 ATGTCTAACC AACTCCAGCC ATGTATAAGC TTAGGCTGCG TAAGTTATAT 

51 TAATTCCTTT CCGCTGTCCC TACAACTCAT AAAAAGAAAC GATATTCGCT

101 GTGTTCTTGC TCCCCCTGCA GACCTCCTCA ACTTGCTAAT CGAAGGGAAA

151 CTCGATGTTG CTTTGACCTC ATCCCTAGGA GCTATCTCTC ATAACTTGGG

201 GTATGTCCCC GGCTTTGGAA TTGCAGCAAA CCAACGTATC CTCAGTGTAA

251 ACCTCTATGC AGCTCCCACT TTCTTTAACT CACCGCAACC TCGGATTGCC

301 GCAACTTTAG AAAGTCGCTC CTCTATAGGA CTCTTAAAAG TGCTTTGTCG 

351 TCATCTCTGG CGCATCCCAA CTCCTCATAT CCTAAGATTC ATAACTACAA 

401 AAGTACTCAG ACAAACCCCT GAAAATTATG ATGGCCTCCT CCTAATCGGA 

451 GATGCAGCGC TACAACATCC TGTACTTCCT GGATTTGTAA CCTATGACCT 

501 TGCCTCGGGG TGGTATGATC TTACAAAGCT ACCTTTTGTA TTTGCTCTTC 

551 TTCTACACAG CACCTCTTGG AAAGAACATC CCCTACCCAA CCTTGCGATG 

601 GAAGAAGCCC TCCAACAGTT CGAATCTTCA CCCGAAGAAG TCCTTAAAGA 

651 AGCTCATCAA CATACAGGTC TGCCCCCTTC TCTTCTTCAA GAATACTATG 

701 CCCTATGCCA GTACCGTCTA GGAGAAGAAC ACTACGAAAG CTTTGAAAAA 

751 TTCCGGGAAT ATTATGGAAC CCTCTACCAA CAAGCCCGAC TGTAA 
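
The correspondence between a nucleotide listing and the protein listing that precedes it can be checked mechanically. The following is a minimal Python sketch, assuming the standard genetic code; the helper names `parse_listing` and `translate` are illustrative only. It joins a position-numbered listing into one contiguous sequence and translates it codon by codon, here using the first row of the cp6804 nucleotide listing as input.

```python
# Standard genetic code, with codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def parse_listing(text):
    """Join a position-numbered listing into one uppercase sequence string."""
    blocks = []
    for line in text.splitlines():
        tokens = line.split()
        # Drop the leading position label (e.g. "51") when one is present.
        if tokens and tokens[0].isdigit():
            tokens = tokens[1:]
        blocks.extend(tokens)
    return "".join(blocks).upper()


def translate(dna):
    """Translate complete codons from position 1; a stop codon becomes '*'."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


first_row = "1 ATGTCTAACC AACTCCAGCC ATGTATAAGC TTAGGCTGCG TAAGTTATAT"
print(translate(parse_listing(first_row)))  # prints MSNQLQPCISLGCVSY
```

The printed residues agree with the start of the cp6804 protein listing (MSNQLQPCIS LGCVSY...). The two trailing nucleotides of the row are left untranslated because they belong to a codon completed on the next row.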

The PSORT algorithm predicts inner membrane (0.060). 

The protein was expressed in E.coli and purified as a GST-fusion product (Figure 133A). The
recombinant protein was used to immunise mice, whose sera were used in a Western blot (Figure 
133B) and for FACS analysis. 

These experiments show that cp6804 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 



Example 134 

The following C.pneumoniae protein (PID 4376805) was expressed <SEQ ID 267; cp6805>: 

1 MSSLLSCGRI EPTRVTCSLK TYLEDTSQNQ LSTRLVRASV IFLCALLIIL 

51 VCVALSSLIP SIMALATSFT VMGLILFVMS LLGDVAIISY LTYSTVTSYR

101 QNKRAFEIHK PARSVYYEGV RHWDLGRSSL GTGEIPIVRT LFSPFQNHGL

151 NHALAAKIFL FMEHFSPEPP NEPLVDWACL IRDFRPHVSS LCFVIEKQGS

201 SLRTKEGNTI CEAFRSDYDA HFAMVDCYRL IHSKLIIEKM GLKNIDIIPS 

251 VMVREDYPSR PGEGYREGLL RMYGGKGAL* 



The cp6805 nucleotide sequence <SEQ ID 268> is: 
1 ATGTCATCAC TACTGAGCTG CGGAAGAATA GAGCCGACTC GGGTTACCTG

51 TAGCTTAAAG ACGTATCTTG AGGATACGAG TCAGAATCAG TTGAGCACAC 

101 GTCTAGTTCG GGCAAGTGTC ATCTTTTTAT GCGCATTGTT GATCATTTTG 

151 GTTTGTGTGG CCCTCTCTAG TTTGATTCCA AGCATTATGG CCTTGGCGAC 

201 CTCTTTTACG GTAATGGGGT TAATTCTTTT TGTGATGTCA CTTCTTGGTG

251 ACGTTGCAAT TATAAGTTAT CTTACTTATA GCACTGTTAC GAGTTACCGG

301 CAAAATAAGA GAGCTTTTGA GATTCACAAG CCCGCTCGCT CCGTTTACTA

351 CGAGGGGGTC CGCCATTGGG ATTTAGGACG ATCATCTTTA GGCACAGGCG

401 AGATTCCTAT AGTAAGGACG TTATTCTCTC CATTTCAGAA CCATGGTCTT

451 AACCATGCCT TAGCTGCTAA AATTTTCCTA TTTATGGAGC ATTTCAGCCC

501 TGAGCCACCG AACGAGCCTT TGGTGGATTG GGCCTGTTTG ATTCGGGATT 

551 TTAGGCCTCA CGTCAGTTCT TTGTGCTTTG TTATTGAAAA ACAAGGGTCA 

601 TCGCTGAGGA CTAAGGAAGG CAATACGATT TGTGAGGCTT TCCGCTCTGA 

651 TTACGACGCC CATTTTGCTA TGGTAGATTG CTACCGGTTG ATCCACTCTA 

701 AGTTGATTAT AGAGAAAATG GGATTGAAGA ATATCGATAT CATTCCGAGT 

751 GTCATGGTTC GTGAAGATTA TCCTAGCCGT CCTGGGGAGG GCTATCGCGA 

801 AGGCCTATTA CGTATGTATG GTGGCAAGGG GGCTCTGTGA 

The PSORT algorithm predicts inner membrane (0.711). 

The protein was expressed in E.coli and purified as a GST-fusion product (Figure 134A). The
recombinant protein was used to immunise mice, whose sera were used in a Western blot (Figure
134B) and for FACS analysis.

These experiments show that cp6805 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 

Example 135 

The following C.pneumoniae protein (PID 4376813) was expressed <SEQ ID 269; cp6813>:

1 MSGPSRTESS QVSVLSYVPR DKEIAPKKQF TIAKISTLAI LASLALGALV 

51 AGISLTIVLG NPVFLALLIT TALFSVVTFL VYHQMTSKVS SNWQKVLEQN

101 FKPLGKAWQE KNVDCYSNEM QFYNNHLNPK FKVAIQTDAS QPFQPTFLTG

151 LRVIEKNQST GIIFNPVGPT NLIDNTATNL STILYSTLKD KSVWDTCKQR

201 EGGPAKGEDP FSPTEVRVVK LPNEALDQTF NLNLSSAEKK SILPTFLGHV

251 CGPKSEELPN QQEYYRQALL AYENCLKAAI ESHAAIVALP LFTSVYEVPP

301 EEILPKEGTF YWDNQTQAFC KRALLDAIQN TALRYPQRSL LVILQDPFNT 

351 IESQSRSEE* 

The cp6813 nucleotide sequence <SEQ ID 270> is: 

1 ATGTCAGGAC CCTCACGTAC TGAGAGCTCT CAAGTTTCTG TACTATCCTA

51 TGTGCCTCGG GATAAAGAAA TTGCTCCTAA AAAACAGTTT ACCATAGCAA 

101 AAATATCCAC TCTTGCAATC CTAGCTTCTT TAGCTTTAGG AGCTTTGGTG 

151 GCTGGAATCT CTTTAACGAT AGTATTAGGG AACCCTGTAT TTTTGGCTCT 

201 TCTCATTACC ACGGCCCTCT TCTCAGTTGT AACCTTCTTA GTCTACCACC 

251 AAATGACCTC AAAGGTATCT TCTAACTGGC AGAAAGTTCT AGAGCAAAAC 

301 TTCAAGCCTT TGGGAAAAGC GTGGCAAGAA AAAAACGTAG ACTGCTACTC 

351 AAACGAGATG CAATTTTACA ATAATCACCT GAACCCTAAG TTCAAGGTAG 

401 CGATACAAAC AGATGCGTCT CAACCATTTC AGCCTACTTT CTTAACTGGA 

451 CTTAGAGTGA TCGAAAAAAA TCAATCCACA GGGATCATCT TTAATCCCGT 

501 AGGCCCAACG AATCTGATCG ACAACACTGC AACGAACCTC TCTACTATCC 

551 TTTACTCCAC CCTAAAAGAT AAAAGCGTGT GGGATACATG CAAGCAACGC

601 GAAGGGGGTC CCGCAAAAGG AGAAGACCCC TTTTCCCCTA CCGAAGTGAG

651 AGTAGTAAAA CTTCCAAACG AAGCTCTAGA TCAAACGTTT AATCTAAATT

701 TAAGCTCTGC AGAAAAGAAA AGTATTCTTC CGACCTTTTT AGGCCACGTA

751 TGCGGCCCTA AATCTGAAGA GTTACCAAAT CAGCAAGAAT ATTATCGCCA 

801 AGCTTTACTA GCGTACGAGA ACTGCCTTAA AGCAGCTATA GAAAGTCATG 

851 CAGCAATCGT TGCTCTTCCT CTCTTTACTT CGGTCTATGA AGTGCCTCCA 

901 GAAGAGATTC TTCCTAAAGA AGGCACTTTC TATTGGGACA ACCAAACTCA

951 AGCGTTTTGC AAACGCGCTT TATTGGACGC TATTCAAAAT ACGGCCCTAC 

1001 GCTATCCTCA AAGATCTTTA CTTGTTATAC TCCAAGATCC TTTTAATACT 

1051 ATAGAATCAC AAAGTCGTTC TGAGGAGTAA 
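
The length of a listing can be cross-checked against the protein it encodes using only the final row. The following is a minimal Python sketch; the function name `orf_summary` is illustrative, and it assumes the listing covers a complete open reading frame that starts at position 1. It derives the nucleotide length from the last position label, confirms a whole number of codons, and reports the residue count excluding a terminal stop codon.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def orf_summary(last_label, last_row_blocks):
    """Return (nucleotide length, residue count, ends_with_stop).

    last_label is the 1-based position printed at the start of the final row;
    last_row_blocks are the sequence blocks printed on that row.
    """
    tail = "".join(last_row_blocks).upper()
    length = last_label - 1 + len(tail)
    if length % 3 != 0:
        raise ValueError("sequence length is not a whole number of codons")
    # The final three nucleotides are in frame because length is a multiple of 3.
    ends_with_stop = tail[-3:] in STOP_CODONS
    residues = length // 3 - (1 if ends_with_stop else 0)
    return length, residues, ends_with_stop


# Final row of the cp6813 nucleotide listing.
print(orf_summary(1051, ["ATAGAATCAC", "AAAGTCGTTC", "TGAGGAGTAA"]))
# prints (1080, 359, True)
```

The 359 residues agree with the cp6813 protein listing, whose final row starts at position 351 and carries nine residues before the terminating asterisk.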
The PSORT algorithm predicts inner membrane (0.4291).

The protein was expressed in E.coli and purified as a GST-fusion product (Figure 135A). The
recombinant protein was used to immunise mice, whose sera were used in a Western blot (Figure 
135B) and for FACS analysis. 

These experiments show that cp6813 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 

Example 136 

The following C.pneumoniae protein (PID 4376844) was expressed <SEQ ID 271; cp6844>:

1 MWRVVLRFLI IFILGRAVFP LRASESFSWE TSTCLTVLGI PFIDIILTTN

51 EDFVAQCGLQ IGTISSTNNA KIKEIFLIYK EKFPEASISF KRKEPLNLSQ

101 SHLSDLGILC MRNGETYAEG MANKENGPAL KQPKDLRLVL RCPNQPDTLL

151 YSEKEAEKGI ETNTCLCNQG YTLLDGQLIL YGDSIEKFLK ETKRKNNHTL

201 VDLCDSQVVT TFLGRFWSLL NYVQVLFLSE DSAKILAGIP DLAQATQLLS

251 HTVPLLFIYT NDSIHIIEQG KESSFTYNQD LTEPILGFLF GYINRGSMEY 

301 CFNCAQSSLG ET* 

The cp6844 nucleotide sequence <SEQ ID 272> is: 

1 ATGTGGCGCG TTGTCCTCAG ATTCCTTATA ATTTTTATCT TGGGAAGAGC 

51 CGTCTTCCCT CTAAGAGCTT CAGAAAGCTT CTCCTGGGAA ACATCGACCT 

101 GTTTAACAGT GCTAGGGATT CCTTTCATAG ATATTATCCT CACAACGAAT 

151 GAGGACTTTG TTGCCCAGTG CGGCCTGCAA ATAGGAACCA TTTCTTCGAC 

201 TAATAACGCA AAAATAAAAG AAATTTTTTT GATATATAAG GAAAAATTTC 

251 CAGAAGCCTC TATCAGTTTC AAACGAAAAG AACCTCTAAA CCTTTCCCAA 

301 TCCCATCTCT CCGATTTAGG TATTTTATGT ATGCGTAACG GAGAAACTTA 

351 CGCTGAGGGA ATGGCAAATA AAGAAAACGG ACCCGCTCTA AAACAACCCA 

401 AGGATCTAAG ATTAGTTTTA CGTTGTCCTA ACCAACCAGA TACCCTGCTC

451 TACTCGGAAA AAGAAGCAGA AAAGGGCATA GAAACAAATA CTTGCCTATG

501 CAATCAGGGA TACACACTCC TGGATGGGCA ATTGATTCTC TACGGGGATA

551 GTATAGAAAA GTTTCTGAAA GAGACCAAAA GAAAGAATAA CCACACGCTT

601 GTTGATCTTT GTGACTCACA AGTCGTGACC ACGTTCCTCG GTCGCTTTTG 

651 GTCTCTTCTA AACTACGTTC AAGTTCTTTT CCTATCTGAA GACTCCGCTA 

701 AAATTCTTGC GGGCATCCCA GACCTAGCTC AAGCTACGCA ATTGCTTTCC 

751 CACACCGTAC CTTTGCTTTT TATTTATACC AACGATTCTA TTCACATCAT 

801 AGAACAAGGC AAAGAAAGTA GTTTTACCTA TAACCAAGAT TTAACAGAGC 

851 CCATTTTAGG ATTTCTCTTT GGTTACATAA ATCGCGGCTC TATGGAATAC 

901 TGCTTTAATT GTGCACAGTC TTCATTAGGA GAAACCTAA 

The PSORT algorithm predicts inner membrane (0.1786). 

The protein was expressed in E.coli and purified as a GST-fusion product (Figure 136A) and also in
his-tagged form. The recombinant proteins were used to immunise mice, whose sera were used in a 
Western blot (Figure 136B) and for FACS analysis. 

These experiments show that cp6844 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 

Example 137 

The following C.pneumoniae protein (PID 4377201) was expressed <SEQ ID 273; cp7201>: 

1 VLVGICPSLY PEHPRSFYYR VSGDIGSRFD DRGFVNSGVE TLPYSSGSFG 

51 IFWISFTDPT FNFAIVNTFM RTAGINEVSR PMTQDTETSL IEMRDLSEQQ

101 EANNTDSLEQ EESLMGIVGH TVGGVSMTVT SSPNIFYRIQ TLLGLPETLA

151 EAEENPTFPN STIDSLAEIM MNLVRISDAV SIFWIFPIVD TTYNGVLLAV

201 CIGFFGINGI CSTFLMLTNP RSRRDRWRNL RIMVLCYRSL GSGMNLFDLS

251 NNVRMAARRH VTSCTVALYA MVTLFGWTVA IQDALQYGFP SVRDAFYRYC

301 LRHRYCLTQR NEDSLQTTGT RFQVTRTHLE DQQMVASILN LSVFGLFFGF

351 VGLMTTFGGL EISPSCRWDA ANNRTVGIF*



The cp7201 nucleotide sequence <SEQ ID 274> is: 



1 GTGCTCGTTG GTATCTGTCC TTCTCTATAT CCAGAACATC CTCGCTCCTT 

51 TTATTATCGT GTTTCTGGAG ATATAGGCTC CCGATTCGAC GATAGAGGAT 

101 TTGTAAACTC TGGAGTCGAA ACCCTGCCAT ACTCTTCAGG CAGCTTTGGG 

151 ATTTTTTGGA TCTCGTTTAC GGATCCCACA TTTAATTTTG CTATCGTAAA 

201 TACCTTTATG CGAACTGCAG GGATCAATGA AGTCTCTAGA CCCATGACAC 

251 AAGATACAGA AACTTCATTG ATAGAAATGA GAGACCTAAG TGAACAACAA 

301 GAAGCGAATA ACACAGATTC TTTAGAGCAA GAAGAGAGCT TAATGGGTAT 

351 TGTAGGACAT ACTGTGGGAG GAGTTTCCAT GACCGTGACC TCCAGTCCAA

401 ATATCTTTTA TCGTATACAA ACACTTCTGG GACTGCCAGA GACTCTTGCA

451 GAAGCTGAAG AAAATCCTAC CTTCCCAAAT TCTACTATAG ATAGCCTTGC

501 AGAAATAATG ATGAACCTCG TAAGGATCTC TGATGCTGTC TCTATTTTCT

551 GGATTTTTCC TATCGTAGAT ACTACATATA ATGGAGTTTT ATTAGCCGTC

601 TGTATCGGCT TCTTCGGAAT CAATGGGATT TGTTCCACGT TCCTTATGCT 

651 TACGAATCCA CGCTCTCGTC GAGATAGATG GAGGAATTTA CGCATCATGG 

701 TTCTTTGCTA TCGTTCTTTG GGAAGCGGAA TGAATCTCTT TGATCTTAGC 

751 AATAATGTGC GCATGGCAGC ACGTAGGCAT GTGACATCAT GTACAGTAGC 

801 TCTCTATGCT ATGGTCACTC TATTTGGATG GACAGTAGCA ATACAAGATG 

851 CTTTGCAATA TGGTTTCCCT AGCGTTCGGG ATGCCTTCTA TAGATATTGC 

901 TTACGCCACA GATATTGCTT AACTCAAAGA AACGAAGACT CTCTGCAAAC 

951 TACAGGAACG CGCTTTCAGG TTACCCGTAC ACATCTAGAA GATCAACAGA 

1001 TGGTGGCTTC TATTTTGAAT TTGAGTGTTT TTGGGCTCTT TTTTGGATTC 

1051 GTAGGGCTAA TGACCACGTT TGGAGGATTA GAAATCTCAC CATCTTGTCG

1101 GTGGGATGCA GCAAATAACC GAACGGTAGG TATTTTTTAG 

The PSORT algorithm predicts inner membrane (0.3102). 

The protein was expressed in E.coli and purified as a GST-fusion product (Figure 137A). The
recombinant protein was used to immunise mice, whose sera were used in a Western blot (Figure 
137B) and for FACS analysis. 

These experiments show that cp7201 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 



Example 138 



The following C.pneumoniae protein (PID 4377251) was expressed <SEQ ID 275; cp7251>: 

1 MAPIHGSNAF VEDILHSHPS PQATYFSSTR AQKLHEFKDR HPVLTRIASV

51 IIKIFKVLIG LIILPLGIYW LCQTLCTNSI LPSKNLLKIF KKQPNTKTLK

101 TNYLHALQDY SSKNRVASMR RVPILQDNVL IDTLEICLSQ APTNRWMLIS

151 LGSDCSLEEI ACKEIFDSWQ RFAKLIGANI LVYNYPGVMS STGSSSDKDL

201 ASAHNICTRY LKDKEQGPGA KEIITYGYSL GGLIQAEALR DQKIVANDDT

251 TWIAVKDRCP LFISPEGFHS CRRIGKLVAR LFGWGTKAVE RSQDLPCLEI

301 FLYPTDSLRR STVRQNKLLA PELTLAHAIK NSPYVQNKEF IEVRLSSDID

351 PIDSKTRVAL ATPILKKLS* 

The cp7251 nucleotide sequence <SEQ ID 276> is:

1 ATGGCTCCAA TTCACGGAAG TAATGCGTTT GTTGAGGATA TTTTACATTC 

51 CCACCCTTCT CCACAAGCGA CTTATTTTTC TTCAACACGC GCCCAAAAAC 

101 TTCATGAGTT TAAAGACAGG CATCCCGTGC TTACACGGAT TGCTTCTGTA 

151 ATTATTAAAA TTTTTAAAGT TCTGATAGGG CTGATCATCC TTCCCTTAGG

201 AATCTACTGG CTATGTCAAA CGCTTTGTAC AAACTCGATT CTCCCTTCCA

251 AGAATTTATT AAAAATTTTC AAGAAGCAAC CCAACACTAA AACCTTAAAA

301 ACTAATTATT TGCATGCTTT GCAAGATTAT TCCTCGAAAA ACCGCGTTGC

351 TTCCATGAGA CGAGTTCCTA TCCTCCAGGA TAATGTTCTC ATCGACACTT

401 TGGAAATATG CCTTTCACAA GCACCTACGA ATCGTTGGAT GCTCATTTCT

451 TTAGGAAGTG ACTGTAGCTT GGAAGAAATC GCTTGTAAGG AGATCTTTGA 
501 TTCTTGGCAA AGATTTGCCA AGTTGATAGG GGCCAATATA CTCGTTTATA

551 ACTACCCCGG AGTCATGTCC AGCACAGGGA GCAGCAGCCT AAAGGACCTA 

601 GCATCAGCTC ATAATATTTG TACAAGATAC CTTAAAGATA AAGAACAGGG 

651 CCCTGGAGCA AAAGAAATCA TTACCTATGG GTACTCCCTA GGAGGTTTGA 

701 TACAAGCAGA AGCATTGCGA GACCAGAAGA TTGTTGCAAA CGATGATACT 

751 ACTTGGATAG CAGTCAAAGA TAGGTGTCCT CTCTTTATAT CTCCAGAAGG 

801 TTTCCACAGT TGCAGACGCA TAGGAAAGCT AGTAGCTCGT CTTTTTGGCT 

851 GGGGGACCAA AGCCGTAGAG AGAAGCCAAG ACCTTCCCTG CCTAGAAATT 

901 TTTCTCTATC CTACGGATTC CTTACGAAGA TCAACAGTCA GACAGAACAA 

951 GCTCTTAGCA CCTGAACTTA CTCTCGCTCA TGCGATAAAA AATAGTCCCT 

1001 ATGTTCAAAA TAAAGAATTT ATAGAAGTAC GATTATCGTC TGATATCGAT 

1051 CCCATCGACA GCAAAACAAG AGTGGCTCTT GCCACACCAA TTTTGAAAAA 

1101 GCTCTCTTAG 

The PSORT algorithm predicts inner membrane (0.4545). 

The protein was expressed in E.coli and purified as a GST-fusion product (Figure 138A). The
recombinant protein was used to immunise mice, whose sera were used in a Western blot (Figure 
138B) and for FACS analysis. 

These experiments show that cp7251 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 



Example 139 



The following C.pneumoniae protein (PID 4377288) was expressed <SEQ ID 277; cp7288>: 

1 MHMSNPISLF SPAELIAKYN LIPKTSPIYP RRTELIILEE NACQTRLTNV 

51 AQVLHPSSLF SMSKKILNPC GCSGGPLCWV ILNILAFIIT SVLFIILLPV 

101 NLIVAGLKLF MPLPPKKIVE DLSEPTTEET NEVIQPFIFA LQALLFEDNK

151 LRSFKIVEQS VGKAPLPNPF LNRLVAISPQ ESQEAMRKIP DLCSQLKKVL

201 KSLGVLTPEW KHMLKYFEGL KNEHDSNPDK KTFPILIKLL IEALTGKSSL

251 PKTPSTKEKM QAALFIASSC KTCKPTWGEV ITRSLNRLYS IANEGDNQLL

301 IWVQEFKERE LMSIQDGDDA EEYRFAAQQH GERYTEAIEQ VLRNESAAKL

351 QWHVINTMKF FHGKNLGLVT EHLQDTLGAL TLRQTTVDTH QGREDADLSA

401 ALFLNKYLNS GNQLVNSVFK SMQKADPETK ALIREFALDI LYASLRLPQT

451 SAHTEVFSTL LMDPETYEPN KACIAYLLYV LKIIEL*

The cp7288 nucleotide sequence <SEQ ID 278> is: 

1 ATGCATATGT CTAACCCCAT CTCTTTGTTT TCCCCTGCAG AGTTAATAGC 

51 AAAGTACAAT TTAATTCCAA AAACTTCGCC GATTTATCCT CGGAGGACGG 

101 AACTTATTAT CTTGGAAGAA AATGCGTGTC AAACACGCCT AACCAACGTG

151 GCTCAGGTCC TACATCCTTC TAGCCTATTC AGTATGTCAA AAAAAATACT 

201 GAATCCCTGC GGGTGCTCTG GTGGTCCCTT ATGTTGGGTG ATTCTCAACA 

251 TCCTAGCATT TATTATTACT TCAGTACTGT TTATCATTCT TTTACCGGTG 

301 AATCTCATCG TAGCAGGTCT TCGTCTCTTC ATGCCTCTTC CCCCTAAAAA

351 AATCGTAGAG GATTTAAGTG AACCTACTAC TGAAGAAACG AATGAGGTCA

401 TTCAACCCTT CATTTTCGCT TTGCAAGCGT TGCTTTTTGA GGATAACAAA 

451 CTTCGCTCTT TTAAAATTGT TGAACAAAGT GTAGGCAAAG CACCCTTACC 

501 TAATCCCTTT TTAAATAGAC TAGTAGCAAT TTCGCCGCAA GAAAGCCAAG 

551 AAGCCATGCG GAAGATTCCG GATCTATGCT CACAACTGAA AAAAGTATTA 

601 AAGTCTCTAG GCGTGCTAAC TCCAGAATGG AAGCACATGC TGAAGTACTT 

651 TGAGGGACTG AAAAACGAAC ATGATAGTAA TCCTGATAAA AAGACGTTCC 

701 CAATATTGAT CAAGCTCCTC ATAGAAGCTC TTACTGGAAA GTCCTCTTTA 

751 CCCAAAACTC CTAGTACAAA GGAAAAAATG CAAGCGGCCT TATTTATTGC 

801 AAGTTCTTGC AAGACTTGTA AGCCGACTTG GGGAGAAGTC ATAACCAGAT 

851 CTCTTAACAG ACTCTATAGT ATAGCTAATG AAGGAGACAA TCAGCTTCTG

901 ATTTGGGTTC AAGAGTTTAA AGAACGAGAG CTGATGTCCA TCCAAGATGG 

951 TGATGATGCT GAAGAGTATC GGTTTGCGGC TCAGCAACAC GGTGAGCGTT 

1001 ACACAGAGGC AATAGAACAA GTTCTACGAA ACGAGTCAGC AGCCAAACTA 

1051 CAATGGCATG TGATCAACAC TATGAAATTC TTCCATGGGA AAAATCTCGG 

1101 TCTAGTTACA GAACACCTAC AAGATACTCT CGGCGCCCTA ACTTTACGTC

1151 AAACTACAGT GGACACACAT CAAGGCAGAG AAGACGCTGA TTTGTCAGCT

1201 GCTCTTTTCC TAAATAAGTA TTTAAATTCT GGAAATCAAC TTGTTAATAG

1251 CGTCTTTAAA TCCATGCAAA AAGCAGATCC AGAAACCAAA GCTTTAATCC

1301 GTGAGTTTGC TCTAGATATA TTATATGCAT CCTTACGGCT TCCTCAAACT

1351 TCCGCTCATA CCGAGGTCTT TTCTACACTC TTAATGGACC CAGAGACCTA

1401 TGAACCTAAT AAAGCTTGTA TCGCCTACTT GCTCTATGTA TTAAAGATCA 

1451 TCGAACTATA A 

The PSORT algorithm predicts inner membrane (0.5989). 

The protein was expressed in E.coli and purified as a GST-fusion product (Figure 139A). The
recombinant protein was used to immunise mice, whose sera were used in a Western blot (Figure 
139B) and for FACS analysis. 

These experiments show that cp7288 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 

Example 140 

The following C.pneumoniae protein (PID 4377359) was expressed <SEQ ID 279; cp7359>: 



1 MPGSVSSPPL SPVIVRERVP SSSGSDLIQP HAVLKISILI FALVTILGIV

51 LVVLSSALGA LPSLVLTVSG CIAIAVGLIG LGILVTRLIL STIRKVDAMG

101 YDAAVKEEQY LSRIRELESE NREIRDRNRA VEDQCAHLSE ENKDLRDPEY

151 LHGMTERLIA SLEIENQALV AENILLKDWN ASLSRDFRAY KQKFPLGALE

201 PWKEDIACIM EQNLFLKPEC IAMVKSLPLE TQRLFLYPKG FQSLVNRFAP

251 RSRFFQTPKY EYNSRNENED GKVAAVCARL KKEFFSAVLG ACSYEELGGI

301 CERAVALKET LPLPEAVYDT LVQEFPNLLT AESLWKEWCF YSYPYLRPYL

351 SVDYCKRLFV QLFEELCLKL FTTGSPEDQA LVRLFSYYRN HIPAVLASFG

401 LPPPETGGSV FVLLPKQENL LWSQIEVLAT RYLKDTFVRN SEWTGSFEMM

451 FSYNEMCKEI SEGRIRFAED YETRHSEEFP PSPLSEEGEG EEFLPPCSEE

501 EVSVLERPDL DVDSMWVWHP PVPKGPL*

The cp7359 nucleotide sequence <SEQ ID 280> is:

1 ATGCCAGGTT CTGTGTCATC ACCTCCTTTG TCTCCTGTAA TTGTCCGTGA

51 AAGGGTCCCA TCCTCTTCAG GATCCGACCT CATACAGCCT CATGCTGTTT

101 TAAAGATCTC CATCCTAATT TTTGCGCTTG TGACAATTTT AGGAATTGTT

151 CTTGTAGTGT TGTCTAGTGC TTTAGGAGCT CTTCCTAGTT TAGTTTTGAC

201 GGTTTCTGGT TGTATTGCAA TAGCTGTAGG CCTGATTGGT TTAGGGATTC

251 TTGTGACACG GCTGATTCTC TCTACGATCA GAAAAGTAGA TGCCATGGGT

301 TATGATGCTG CGGTCAAAGA AGAGCAGTAT TTGTCACGTA TCAGAGAATT

351 AGAGTCTGAA AATAGAGAGA TTAGAGATAG AAATCGTGCT GTCGAAGATC

401 AGTGTGCCCA TTTATCCGAA GAGAACAAGG ACCTTAGGGA TCCCGAATAT

451 CTACATGGAA TGACTGAAAG GCTCATTGCG AGCTTAGAAA TAGAGAATCA

501 AGCTCTCGTA GCTGAGAACA TTCTTCTCAA AGACTGGAAT GCAAGCCTAT

551 CTAGAGATTT CCGCGCATAT AAGCAAAAAT TTCCTCTTGG GGCATTAGAA

601 CCCTGGAAAG AAGATATTGC ATGTATCATG GAACAAAATC TCTTTTTAAA

651 ACCGGAATGT ATCGCGATGG TTAAGTCTCT TCCATTAGAG ACGCAACGGC

701 TGTTTTTATA TCCAAAAGGA TTTCAGTCTT TAGTTAATCG ATTTGCTCCG

751 CGGTCTCGCT TTTTCCAGAC TCCAAAGTAT GAATATAACA GTAGGAATGA

801 AAATGAGGAC GGAAAGGTAG CCGCAGTGTG CGCCCGTTTG AAAAAAGAAT

851 TCTTCAGTGC TGTTTTAGGA GCCTGTAGTT ACGAAGAACT AGGGGGCATT

901 TGTGAAAGAG CAGTAGCACT TAAAGAGACG TTGCCATTGC CTGAAGCTGT

951 CTATGATACC CTAGTTCAGG AGTTCCCAAA TCTTCTTACT GCTGAGAGTT

1001 TATGGAAAGA ATGGTGCTTC TATTCCTATC CCTACCTTCG TCCCTATCTT

1051 TCTGTGGATT ACTGTAAGAG GTTATTTGTA CAACTTTTTG AGGAACTCTG

1101 CCTAAAGCTT TTTACAACGG GATCTCCAGA AGACCAAGCT TTGGTTCGCC

1151 TTTTCTCTTA CTATAGGAAT CATATTCCCG CAGTCTTGGC CTCATTTGGT

1201 TTGCCCCCGC CTGAGACAGG GGGGTCTGTA TTTGTATTGC TACCAAAACA

1251 AGAAAACCTT CTTTGGAGTC AAATTGAGGT GCTGGCTACA AGGTATCTCA

1301 AAGATACCTT CGTGAGAAAC TCAGAATGGA CGGGCTCTTT CGAGATGATG

1351 TTTTCTTATA ACGAGATGTG TAAGGAGATC TCCGAAGGAA GGATTCGTTT

1401 TGCTGAAGAC TATGAAACGA GGCATTCCGA AGAATTCCCT CCTTCCCCTC

1451 TCTCTGAAGA AGGAGAGGGC GAAGAATTCC TTCCTCCTTG CTCTGAAGAA

1501 GAGGTTTCGG TTCTTGAGCG CCCAGATCTA GATGTAGACT CTATGTGGGT

1551 CTGGCATCCG CCGGTCCCTA AGGGACCTCT TTAA




The PSORT algorithm predicts inner membrane (0.7453). 

The protein was expressed in E.coli and purified as a GST-fusion product (Figure 140A). The
recombinant protein was used to immunise mice, whose sera were used in a Western blot (Figure 
140B) and for FACS analysis. 

These experiments show that cp7359 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 

Example 141 

The following C.pneumoniae protein (PID 4377374) was expressed <SEQ ID 281; cp7374>: 

1 MDKQSSGNSG CIWHPFTQSA LDSTPIKIVR GEGAYLYAES GTRYLDAISS 

51 WWCNLHGHGH PYITKKLCEQ AQKLEHVIFA NFTHEPALEL VSKLAPLLPE

101 GLERFFFSDN GSTSIEIAMK IAVQYYYNQN KAKSHFVGLS NAYHGDTFGA

151 MSIAGTSPTT VPFHDLFLPS STIAAPYYGK EELAIAQAKT VFSESNIAAF

201 IYEPLLQGAG GMLMYNPEGL KEILKLAKHY GVLCIADEIL TGFGRTGPLF

251 ASEFTDIPPD IICLSKGLTG GYLPLALTVT TKEIHDAFVS QDRMKALLHG 

301 HTFTGNPLGC SAALASLDLT LSPECLQQRQ MIERCHQEFQ EAHGSLWQRC 

351 EVLGTVLALD YPAEATGYFS QYRDHLNRFF LERGVLLRPL GNTLYVLPPY 

401 CIQEEDLRII YSHLQDALCL QPQ* 

The cp7374 nucleotide sequence <SEQ ID 282> is: 

1 ATGGACAAGC AATCATCAGG GAATTCAGGG TGTATCTGGC ACCCCTTCAC 

51 TCAATCTGCA TTAGATTCTA CACCCATAAA GATTGTAAGG GGAGAAGGTG

101 CTTACCTCTA TGCGGAATCA GGAACAAGAT ATCTTGATGC GATATCTTCA

151 TGGTGGTGCA ACCTCCACGG TCATGGGCAT CCCTACATTA CAAAAAAATT 

201 ATGTGAGCAA GCACAGAAGT TAGAACATGT GATCTTCGCA AATTTCACCC 

251 ATGAACCGGC TCTAGAGCTC GTATCGAAAC TCGCTCCCCT CCTTCCTGAA 

301 GGTCTAGAAC GTTTCTTTTT CTCTGACAAC GGATCAACGT CTATCGAAAT

351 AGCAATGAAA ATTGCTGTGC AATATTACTA CAATCAAAAC AAGGCTAAGA 

401 GCCATTTTGT TGGACTCAGC AATGCCTATC ACGGAGATAC ATTTGGAGCT 

451 ATGTCGATAG CTGGCACGAG CCCTACTACA GTTCCCTTTC ATGATCTTTT 

501 TCTTCCTTCC AGTACAATTG CTGCTCCCTA TTATGGCAAG GAAGAGCTTG 

551 CCATTGCCCA AGCAAAAACA GTCTTTTCTG AAAGCAATAT CGCAGCGTTT 

601 ATCTATGAGC CGCTATTGCA AGGTGCTGGA GGGATGTTAA TGTATAATCC 

651 CGAAGGCCTA AAGGAGATTC TCAAGCTTGC CAAGCATTAC GGGGTTCTCT 

701 GTATTGCTGA TGAAATTCTT ACTGGCTTTG GCCGTACGGG TCCACTGTTT

751 GCTTCTGAAT TTACAGACAT TCCTCCTGAC ATTATCTGTC TTTCTAAAGG 

801 TCTTACAGGA GGCTATCTCC CTCTAGCCTT GACAGTAACC ACTAAAGAAA 

851 TTCATGATGC CTTTGTCTCC CAAGATCGGA TGAAGGCACT GCTTCATGGC 

901 CATACCTTCA CAGGAAATCC TTTAGGCTGT AGTGCTGCCC TCGCTTCTTT

951 GGATCTCACC CTATCTCCAG AATGCCTACA ACAAAGGCAA ATGATAGAAC 

1001 GGTGTCATCA AGAGTTTCAA GAAGCTCATG GTTCCCTATG GCAACGGTGT 

1051 GAGGTTCTGG GCACGGTACT CGCTCTAGAT TACCCTGCAG AAGCTACAGG

1101 ATATTTTTCA CAATATAGAG ACCATCTCAA TCGCTTTTTC TTAGAACGTG 

1151 GAGTCCTTCT TCGTCCTTTA GGGAACACAC TGTATGTGCT GCCCCCCTAC 

1201 TGTATCCAAG AAGAAGATCT CCGGATTATT TATTCTCACC TACAGGATGC 

1251 CCTATGTCTA CAACCACAGT AA 

The PSORT algorithm predicts cytoplasm (0.2930). 

The protein was expressed in E.coli and purified as a GST-fusion product (Figure 141A) and also in
his-tagged form. The recombinant proteins were used to immunise mice, whose sera were used in a
Western blot (Figure 141B) and for FACS analysis.

These experiments show that cp7374 is a surface-exposed and immunoaccessible protein, and that it
is a useful immunogen. These properties are not evident from the sequence alone.

Example 142

The following C.pneumoniae protein (PID 4377377) was expressed <SEQ ID 283; cp7377>: 

1 MREETVSWSL EDIREIYHTP VFELIHKANA ILRSNFLHSE LQTCYLISIK

51 TGGCVEDCAY CAQSSRYHTH VTPEPMMKIV DVVERAKRAV ELGATRVCLG

101 AAWRNAKDDR YFDRVLAMVK SITDLGAEVC CALGMLSEEQ AKKLYDAGLY

151 AYNHNLDSSP EFYETIITTR SYEDRLNTLD VVNKSGISTC CGGIVGMGES

201 EEDRIKLLHV LATRDHIPES VPVNLLWPID GTPLQDQPPI SFWEVLRTIA

251 TARVVFPRSM VRLAAGRAFL TVEQQTLCFL AGANSIFYGD KLLTVENNDI

301 DEDAEMIKLL GLIPRPSFGI ERGNPCYANN S* 

The cp7377 nucleotide sequence <SEQ ID 284> is: 

1 ATGCGTGAAG AAACTGTATC CTGGTCATTA GAAGACATCC GCGAAATTTA 

51 TCACACTCCC GTATTTGAGC TGATTCACAA AGCCAATGCC ATATTGCGTA

101 GTAATTTCCT CCATTCAGAA CTGCAGACTT GCTATCTGAT TTCGATTAAA 

151 ACTGGTGGAT GCGTTGAAGA TTGCGCCTAC TGTGCCCAAT CTTCCCGCTA 

201 TCATACCCAC GTCACACCAG AACCTATGAT GAAAATTGTA GACGTTGTGG 

251 AAAGGGCAAA ACGTGCTGTA GAGCTAGGCG CCACTCGTGT GTGTCTTGGG 

301 GCTGCCTGGC GCAATGCTAA GGACGATCGA TACTTTGATA GAGTCCTCGC 

351 TATGGTGAAA AGTATCACAG ATCTCGGAGC CGAGGTTTGT TGTGCTTTAG 

401 GCATGCTCTC CGAAGAGCAA GCTAAAAAAC TGTATGATGC AGGACTTTAT 

451 GCCTACAATC ATAATTTAGA CTCTTCTCCG GAATTCTATG AAACTATAAT 

501 CACAACACGT TCTTATGAAG ATCGCCTCAA CACTCTTGAT GTAGTAAATA 

551 AATCTGGCAT TAGTACATGC TGCGGTGGTA TTGTAGGTAT GGGAGAATCT 

601 GAAGAAGACC GTATAAAGCT TCTTCATGTT CTTGCAACAA GAGATCATAT 

651 CCCAGAATCC GTACCTGTAA ATTTACTTTG GCCGATTGAC GGCACGCCTT

701 TGCAAGACCA GCCTCCGATT TCTTTCTGGG AAGTCTTGCG AACCATAGCA 

751 ACGGCACGGG TTGTTTTCCC CAGATCCATG GTACGACTTG CTGCAGGACG 

801 CGCTTTCCTC ACAGTAGAAC AACAAACCTT ATGTTTTCTA GCCGGTGCCA 

851 ACTCCATATT CTATGGAGAT AAACTGTTGA CTGTAGAAAA CAATGATATA 

901 GATGAAGATG CTGAAATGAT CAAACTTTTA GGCTTAATCC CTCGCCCTTC 

951 ATTTGGAATA GAAAGAGGTA ACCCATGTTA TGCCAACAAT TCCTAA 

The PSORT algorithm predicts cytoplasm (0.2926). 

The protein was expressed in E.coli and purified as a GST-fusion product (Figure 142A) and also in
his-tagged form. The recombinant proteins were used to immunise mice, whose sera were used in a 
Western blot (Figure 142B) and for FACS analysis. 

These experiments show that cp7377 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 

Example 143 



The following C.pneumoniae protein (PID 4377407) was expressed <SEQ ID 285; cp7407>: 

1 MVCPNNSWFR MCGNFNCEWV EVTTTEETTR QSASDISEEA GSSGGAAPIT 

51 TQPTKITKVE KRVQFNTAQG DESTIHMIQE AGELVDSILS HRRTQGCTEY 

101 CYDSYATGCG QRCGSFGRLI CGTYKACCLD REDNQVAGLV HECEQTHGPI 

151 AVALAAKTMG LNLMELVEKN TILSEEQKNE FRQHCSEAKT QLYGTMQSLS

201 QNFFLEGVNS IRERGLDDSL VQAVLSFIAT RSWEKTIESE EASGTSSASN

251 STRIPACYIL NTSPLTTSRL SCGSRDARRP SSVGAEPQYV AKKYNDNGMA

301 RQLGKIQVTN LKTGDFSALG PFGLLIVKML NSFLLSASQS TSSILKHTGG

351 EICYTCPNFR DIVVLLMLAI GYCPANTDET SVVDIHMIDD PIMTIFYRLQ

401 YSYRTGKTSA SFLKKKPSLV RQESLDCPTP AESVPLMSSL EEEDENEDDD

451 EDGNLAYQQR ILECSGHLQT LFLGIKINKE*

The cp7407 nucleotide sequence <SEQ ID 286> is: 

1 ATGGTTTGCC CAAATAATTC TTGGTTCAGA ATGTGTGGAA ATTTCAACTG 

51 CGAATGGGTT GAAGTAACAA CAACAGAAGA AACAACGCGG CAATCGGCTT 

101 CAGATATAAG CGAAGAAGCT GGTTCGAGTG GAGGAGCTGC TCCTATAACT 

151 ACGCAACCTA CTAAAATTAC AAAAGTAGAG AAACGTGTCC AATTTAATAC 
201 TGCTCAAGGT GATGAAAGTA CAATACACAT GATCCAAGAA GCAGGAGAAT

251 TGGTAGACTC CATTCTATCA CATAGACGAA CGCAAGGATG TACAGAGTAT 

301 TGTTATGACA GTTACGCAAC TGGATGTGGT CAGCGTTGCG GATCTTTTGG 

351 AAGACTCATT TGTGGAACGT ATAAAGCGTG TTGCTTAGAC AGAGAGGATA

401 ATCAGGTTGC TGGACTTGTC CATGAATGCG AACAGACCCA TGGTCCTATT

451 GCCGTTGCTT TAGCTGCTAA AACTATGGGC CTCAACTTAA TGGAACTTGT 

501 AGAAAAAAAC ACTATTTTGT CTGAAGAACA GAAAAATGAA TTTAGACAGC 

551 ATTGCTCGGA AGCTAAAACC CAACTCTATG GAACGATGCA GAGCCTTTCT 

601 CAAAACTTTT TCCTTGAAGG AGTCAACAGC ATTAGAGAAC GCGGTCTAGA 

651 CGATTCACTA GTCCAAGCCG TGCTAAGCTT TATTGCTACA AGGTCTTGGG 

701 AAAAAACTAT AGAATCAGAG GAAGCCTCAG GAACATCTTC TGCTTCTAAT 

751 TCTACACGCA TTCCTGCGTG CTATATCTTA AATACGAGCC CCTTAACGAC 

801 GTCACGCCTA TCCTGTGGAT CAAGAGATGC GCGACGCCCA TCTTCAGTCG 

851 GTGCAGAGCC CCAGTACGTA GCAAAAAAAT ACAATGACAA TGGCATGGCC 

901 AGACAATTAG GAAAAATCCA AGTCACCAAT CTAAAAACAG GAGATTTTTC 

951 AGCTTTAGGT CCTTTTGGTC TCCTGATTGT GAAAATGCTG AATAGCTTTC 

1001 TCTTATCTGC ATCACAAAGC ACATCTTCTA TTCTAAAGCA CACAGGTGGA 

1051 GAAATATGTT ATACGTGCCC AAATTTTCGT GATATCGTCG TTTTATTGAT 

1101 GTTAGCGATT GGCTATTGCC CTGCAAATAC CGATGAGACA TCTGTCGTAG 

1151 ATATACACAT GATAGATGAT CCGATTATGA CCATCTTCTA TCGACTACAA 

1201 TACAGCTATA GAACAGGGAA AACTTCAGCA TCGTTTTTAA AAAAGAAACC 

1251 CTCATTAGTA AGACAGGAAA GTCTTGATTG TCCTACCCCT GCAGAATCTG 

1301 TCCCTCTCAT GTCAAGTCTC GAAGAAGAAG ATGAAAATGA AGATGATGAT 

1351 GAGGATGGGA ATTTGGCGTA TCAACAGCGT ATCCTTGAAT GCTCGGGTCA 

1401 TTTACAAACT CTATTTTTAG GGATAAAAAT AAACAAAGAA TAA 

The PSORT algorithm predicts inner membrane (0.1319). 

The protein was expressed in E.coli and purified as a GST-fusion product (Figure 143A). The 
recombinant protein was used to immunise mice, whose sera were used in a Western blot (Figure 
143B) and for FACS analysis. 

These experiments show that cp7407 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 

Example 144 

The following C.pneumoniae protein (PID 4376432) was expressed <SEQ ID 287; cp6432>: 

1 MTRSTIESSD SLCSRSFSQK LSVQTLKNLC ESRLMKITSL VIAFLTLIVG

51 GALIALAGGG VLSFPLGLIL GSVLVLFSSI YLVSCCKFFT LKEMTMTCSV

101 KSKINIWFEK QRNKDIEKAL ENPDLFGENK RNVGNRSARN QLEMILHETD

151 GIILKRYMKG AKMYFYL*

The cp6432 nucleotide sequence <SEQ ID 288> is: 

1 ATGACTAGAA GTACTATTGA AAGCAGTGAT TCGCTATGCT CAAGGTCTTT 
51 TTCTCAAAAA TTAAGTGTCC AGACATTAAA AAATCTCTGT GAAAGTAGAT 
101 TAATGAAGAT CACTTCTCTT GTGATTGCTT TCCTAACTCT AATTGTGGGG 
151 GGTGCTCTTA TAGCTTTAGC AGGAGGGGGG GTTCTTTCTT TCCCTCTTGG 
201 GCTAATCTTA GGAAGCGTAC TCGTTTTGTT TTCTTCTATC TATTTAGTCT 
251 CTTGTTGTAA ATTTTTTACT TTAAAAGAGA TGACAATGAC CTGTAGTGTC 
301 AAATCTAAAA TCAATATATG GTTTGAAAAG CAACGAAACA AAGACATCGA 
351 AAAGGCATTA GAGAATCCAG ATCTCTTTGG AGAAAATAAG AGAAATGTTG 
401 GAAATCGTTC GGCAAGAAAT CAACTAGAAA TGATCTTACA CGAGACTGAC 
451 GGAATTATTT TGAAAAGATA TATGAAAGGA GCTAAAATGT ACTTTTATTT 
501 ATGA 

The PSORT algorithm predicts inner membrane (0.5394). 

The protein was expressed in E.coli and purified as a his-tagged product (Figure 144A). The 
recombinant protein was used to immunise mice, whose sera were used in a Western blot (Figure 
144B) and for FACS analysis. 

These experiments show that cp6432 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 
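
As an informal consistency check on the paired listings (not part of the disclosed methods), a coding sequence can be translated in frame 1 with the standard codon table and compared with the printed protein. The sketch below is illustrative only: the helper names `clean_block` and `translate` are hypothetical, and it assumes that position numbers and spacing are the only non-sequence characters in a block. Because it applies the codon table literally, a GTG or TTG first codon is reported as V or L, which is how the listings here print such proteins.

```python
import re

# Standard genetic code, built from the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def clean_block(block: str) -> str:
    """Remove position numbers, spaces and line breaks from a numbered listing."""
    return re.sub(r"[^A-Za-z*]", "", block).upper()


def translate(dna: str) -> str:
    """Translate in frame 1; codons not in the table are reported as 'X'."""
    codons = (dna[i:i + 3] for i in range(0, len(dna) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


if __name__ == "__main__":
    # First 30 nucleotides of the cp6432 listing (SEQ ID 288).
    first_line = "1 ATGACTAGAA GTACTATTGA AAGCAGTGAT"
    print(translate(clean_block(first_line)))  # MTRSTIESSD
```

A mismatch between the translation and the printed protein at a given position points to a transcription error in one of the two listings, but does not by itself indicate which one is wrong.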

Example 145 

The following C. pneumoniae protein (PID 4376433) was expressed <SEQ ID 289; cp6433>: 

1 MNWVPKTIDH VDPESEIDIR KVVSCYKLIK ECQPEFRSLI SELLGVIRCG 

51 LRLLKRSKYQ EQARTVSDED APLFCLTRSY YQDGYLTPLR AGPRDLINHY 

101 IHLRRRENPK HFFSPKHPCY YARLAFNESV CVYRELFDIE RLTKMYVEGD 

151 YSKEQEKNLQ AILSFVKTLD EGKDFLIEHK DTDLIGRGFT DVFCT* 

The cp6433 nucleotide sequence <SEQ ID 290> is: 

1 ATGAATTGGG TTCCAAAAAC AATAGACCAT GTAGATCCAG AATCAGAGAT 

51 AGATATACGT AAAGTCGTCT CCTGCTATAA GTTGATAAAA GAATGTCAAC 

101 CTGAATTTCG ATCTCTTATA AGTGAATTAC TAGGAGTGAT TCGGTGTGGC 

151 TTAAGACTAT TAAAACGTTC TAAGTATCAA GAACAGGCTA GAACTGTATC 

201 TGATGAAGAT GCACCTCTTT TCTGCCTGAC TCGTTCTTAT TATCAAGATG 

251 GTTATCTCAC GCCATTAAGA GCAGGACCTC GTGATCTTAT AAATCACTAT 

301 ATACACTTGC GTCGCCGAGA GAATCCTAAG CATTTTTTCA GTCCTAAGCA 

351 TCCATGTTAT TATGCTCGAT TGGCTTTTAA TGAGTCAGTG TGTGTCTATA 

401 GAGAACTCTT TGATATAGAG CGACTTACAA AAATGTATGT CGAGGGTGAT 

451 TATTCTAAAG AACAAGAGAA AAACCTACAG GCTATTCTTA GTTTTGTGAA 

501 AACTCTAGAT GAAGGAAAGG ACTTTCTTAT TGAACATAAA GATACCGATC 

551 TCATTGGGAG AGGTTTTACT GATGTGTTCT GCACTTAA 

The PSORT algorithm predicts cytoplasm (0.4068). 

The protein was expressed in E.coli and purified as a his-tagged product (Figure 145A). The 
recombinant protein was used to immunise mice, whose sera were used in a Western blot (Figure 
145B) and for FACS analysis. 

These experiments show that cp6433 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 



Example 146 

The following C.pneumoniae protein (PID 4376643) was expressed <SEQ ID 291; cp6643>: 

1 MGYLPVSATD VLFESPAAPL INSANTQNQK LIELKGKQQA ESSPRTITSV 
51 ILEVLLVIGC CLIVLSLLAI RPALQFTLET GHPAAIAVLA VSGTILLVAV 
101 IILFCFLAAV PFAAKKTYKY VKTVDDYASW HSHQQTPTLG TIFSGIVYAE 
151 SQAQL* 

The cp6643 nucleotide sequence <SEQ ID 292> is: 

1 ATGGGATATC TTCCAGTATC TGCTACGGAC GTTCTTTTTG AAAGTCCAGC 

51 CGCTCCCTTA ATCAATAGCG CAAACACACA AAATCAGAAA CTCATAGAAC 

101 TCAAGGGGAA GCAGCAAGCT GAGTCTTCTC CACGGACAAT CACTTCTGTC 

151 ATATTGGAAG TTCTCCTAGT GATCGGATGC TGCCTCATAG TTCTTAGTTT 

201 ATTGGCAATC CGCCCTGCTC TGCAATTCAC TCTAGAAACT GGACATCCAG 

251 CTGCCATTGC AGTCCTTGCT GTCTCAGGAA CAATTCTATT GGTGGCTGTT 

301 ATCATCTTGT TTTGCTTTCT AGCAGCTGTG CCATTCGCTG CTAAGAAAAC 

351 TTATAAATAT GTTAAGACGG TTGATGACTA TGCTTCTTGG CATTCTCATC 

401 AGCAAACACC GACCCTAGGC ACTATCTTTT CAGGTATCGT CTATGCAGAA 

451 TCCCAGGCGC AATTATAG 



The PSORT algorithm predicts inner membrane (0.6859). 

The protein was expressed in E.coli and purified as a his-tagged product (Figure 146A). The 
recombinant protein was used to immunise mice, whose sera were used in a Western blot (Figure 
146B) and for FACS analysis. 

These experiments show that cp6643 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 

Example 147 

The following C.pneumoniae protein (PID 4376722) was expressed <SEQ ID 293; cp6722>: 

1 VSSTLNGVFP SSLPEESADL FITNKEIVAL GEKGNVFLTH SIPMHIAAIT 

51 ILVIVALAGI AIICLGCYSQ SILLIAVGIV LTILTLLCLQ ALVGFIKFIR 

101 QLPQQLHTTV QFIREKIRPE SSLQLVTNAQ RKTTQDTLKL YEELCDLSQK 

151 EFKLQSTLYQ KRFELSHKNE KTNQN* 

The cp6722 nucleotide sequence <SEQ ID 294> is: 

1 GTGTCTAGTA CTTTAAACGG GGTATTTCCC TCATCCCTTC CGGAAGAGTC 

51 TGCTGATTTA TTCATTACGA ATAAGGAGAT CGTAGCTTTG GGGGAGAAGG 

101 GCAATGTTTT TCTCACCCAC TCCATTCCTA TGCATATTGC TGCGATTACG 

151 ATCTTAGTGA TTGTAGCTCT TGCTGGAATC GCTATTATCT GTTTGGGTTG 

201 CTATAGCCAA AGCATTCTGT TGATTGCCGT TGGCATTGTT CTTACTATTT 
251 TGACTCTTCT CTGCCTACAA GCCTTGGTAG GATTTATTAA ATTCATCCGG 

301 CAGCTCCCTC AGCAGCTCCA TACGACAGTA CAATTTATCA GGGAGAAGAT 
351 TCGACCTGAA TCCTCTCTAC AGCTTGTAAC CAATGCACAG AGAAAAACCA 
401 CTCAAGATAC GCTAAAGTTA TACGAAGAAC TCTGCGACCT CTCACAAAAA 
451 GAGTTCAAAC TGCAATCAAC TCTTTATCAA AAACGTTTTG AGCTTTCTCA 
501 CAAGAATGAA AAGACAAATC AAAACTAG 

The PSORT algorithm predicts inner membrane (0.6668). 

The protein was expressed in E.coli and purified as a his-tagged product (Figure 147A). The 
recombinant protein was used to immunise mice, whose sera were used in a Western blot (Figure 
147B) and for FACS analysis. 

These experiments show that cp6722 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 

Example 148 

The following C.pneumoniae protein (PID 4377253) was expressed <SEQ ID 295; cp7253>: 

1 MSELAPCSTG LQMVPHTQVH HALDTRRVIL TIAACLSLIA GIVLVGLGAA 

51 AILPSLFGVI GGMILILFSS IALIYLYKKT REVDQIALEP LPEMISKDQS 

101 IIDFVKTRDY ASLEKKATFA YTHTHYYDGS MVFYREIPRF MLGSYLADRK 

151 DMDRQALF* 

The cp7253 nucleotide sequence <SEQ ID 296> is: 

1 ATGAGCGAGC TCGCCCCCTG CTCGACAGGA TTGCAGATGG TCCCCCATAC 

51 GCAGGTCCAT CATGCCCTTG ATACGCGGAG AGTCATTCTA ACGATAGCCG 

101 CCTGTCTGTC TTTAATTGCA GGAATCGTGT TGGTTGGCTT AGGTGCTGCA 

151 GCAATCCTGC CCTCGCTTTT TGGAGTCATT GGAGGAATGA TTCTTATTCT 

201 GTTTTCTTCG ATCGCCCTCA TTTATTTATA CAAGAAGACA AGGGAGGTGG 

251 ATCAGATTGC TCTGGAGCCT CTTCCTGAGA TGATTTCTAA AGATCAAAGC 

301 ATTATAGATT TTGTAAAGAC ACGAGACTAT GCATCTTTAG AAAAGAAAGC 

351 GACCTTTGCT TATACTCATA CTCATTATTA CGATGGAAGC ATGGTCTTCT 

401 ATAGGGAGAT CCCTAGATTT ATGTTAGGCT CTTATCTCGC GCTTCGCAAA 

451 GACATGGACC GCCAAGCTCT TTTTTGA 

The PSORT algorithm predicts inner membrane (0.5394). 

The protein was expressed in E.coli and purified as a his-tagged product (Figure 148A). The 
recombinant protein was used to immunise mice, whose sera were used in a Western blot (Figure 
148B) and for FACS analysis. 

These experiments show that cp7253 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 

Example 149 

The following C.pneumoniae protein (PID 4376264) was expressed <SEQ ID 297; cp6264>: 

1 VISGLLFLLV RREVPTVRSE EIPRGVSVTP SEEPALEKAQ KEPETKKILD 

51 RLPKELDQLD TYIQEVFACL ERLKDPKYED RGLLTEAKEK LRVFDVVEKD 

101 MMSEFLDIQR VLNEEAYYVE HCQDPLENIA YEIFSSQELR DYYCAGVCGY 

151 LPSGDARADR LKRSVKEVMD RFMRVTWKSW EASVMLDHSY GVARELFKKA 

201 VGVLEESVYK ILFKSYRDAF YECEKAKIQR DGRFKWL* 

The cp6264 nucleotide sequence <SEQ ID 298> is: 

1 GTGATTTCGG GACTTCTATT CCTTCTAGTA AGACGAGAGG TTCCGACAGT 

51 ACGTTCAGAG GAAATTCCCA GAGGGGTTTC TGTGACCCCT TCTGAAGAGC 

101 CTGCTCTAGA GAAGGCTCAA AAAGAACCGG AGACAAAGAA AATTTTAGAT 

151 CGGTTGCCGA AGGAATTGGA TCAGTTAGAT ACGTATATTC AGGAAGTGTT 

201 TGCATGTTTA GAGAGGCTGA AGGATCCTAA GTACGAAGAT CGAGGTCTTT 

251 TAACAGAGGC GAAGGAGAAA CTTCGAGTTT TTGACGTTGT TGAGAAAGAT 

301 ATGATGTCAG AGTTTTTAGA CATACAACGA GTGTTGAATG AGGAAGCATA 

351 TTATGTAGAA CATTGTCAAG ATCCCCTAGA GAATATAGCC TACGAGATTT 

401 TCTCTTCCCA AGAGCTTCGT GATTACTACT GTGCAGGGGT GTGTGGGTAT 

451 TTGCCTTCTG GGGATGCTCG AGCGGATCGA TTAAAGAGAT CAGTTAAGGA 

501 GGTAATGGAT CGCTTTATGA GGGTGACCTG GAAATCTTGG GAGGCATCAG 

551 TCATGTTGGA TCATAGCTAT GGGGTAGCGC GAGAGTTATT CAAGAAGGCA 

601 GTAGGAGTAC TAGAGGAGAG TGTCTATAAA ATTCTGTTTA AGAGCTATAG 

651 AGATGCGTTT TATGAATGTG AGAAGGCAAA GATCCAGAGG GATGGGCGTT 

701 TCAAATGGTT ATAG 

The PSORT algorithm predicts cytoplasm (0.2817). 

The protein was expressed in E.coli and purified as a his-tagged product (Figure 149A). The 
recombinant protein was used to immunise mice, whose sera were used in a Western blot (Figure 
149B) and for FACS analysis. 

These experiments show that cp6264 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 

Example 150 

The following C.pneumoniae protein (pid 4376266) was expressed <SEQ ID 299; cp6266>: 

1 MLLLISGALF LTLGIPGLSA AISFGLGIGL SALGGVLMIS GLLCLLVKRE 

51 IPTVRPEEIP EGVSLAPSEE PALQAAQKTL AQLPKELDQL DTDIQEVFAC 

101 LRKLKDSKYE SRSFLNDAKK ELRVFDFVVE DTLSEIFELR QIVAQEGWDL 

151 NFLINGGRSL MMTAESESLD LFHVSKRLGY LPSGDVRGEG LKKSAKEIVA 

201 RLMSLHCEIH KVAVAFDRNS YAMAEKAFAK ALGALEESVY RSLTQSYRDK 

251 FLESERAKIP WNGHITWLRD DAKSGCAEKK LGMPRNVGRN LGKQSFG* 

The cp6266 nucleotide sequence <SEQ ID 300> is: 

1 ATGCTCTTAC TGATTTCAGG AGCTCTCTTT CTGACGTTAG GGATTCCAGG 

51 ATTGAGTGCA GCAATTTCTT TTGGATTAGG CATCGGTCTC TCCGCATTAG 

101 GAGGAGTGCT GATGATTTCG GGACTACTAT GTCTTTTAGT AAAACGAGAG 

151 ATTCCGACAG TACGACCAGA AGAAATTCCT GAAGGGGTTT CGCTGGCTCC 

201 TTCTGAGGAG CCAGCTCTAC AGGCAGCTCA GAAGACTTTA GCTCAGCTGC 

251 CTAAGGAATT GGATCAGTTA GATACAGATA TTCAGGAAGT GTTCGCATGT 

301 TTAAGAAAGC TGAAAGATTC TAAGTATGAA AGTCGAAGTT TTTTAAACGA 

351 TGCTAAGAAG GAGCTTCGAG TTTTTGACTT TGTGGTTGAG GATACCCTCT 

401 CGGAGATTTT CGAGTTGCGG CAGATTGTGG CTCAAGAGGG ATGGGATTTA 

451 AACTTTTTGA TCAATGGGGG ACGAAGCCTC ATGATGACTG CAGAATCTGA 

501 ATCGCTTGAT TTGTTTCATG TATCGAAGCG GCTAGGGTAT TTACCTTCTG 

551 GGGATGTTCG AGGGGAGGGG TTAAAGAAAT CTGCGAAGGA GATAGTCGCT 

601 CGTTTGATGA GCTTGCATTG CGAGATTCAC AAGGTGGCGG TAGCGTTTGA 

651 TAGGAATTCC TATGCGATGG CAGAAAAGGC GTTTGCGAAA GCGTTGGGAG 

701 CTTTAGAAGA GAGTGTGTAT CGGAGTCTGA CGCAGAGTTA TAGAGATAAA 

751 TTTTTGGAGA GCGAGAGGGC GAAGATCCCA TGGAATGGGC ATATAACCTG 

801 GTTAAGAGAT GATGCGAAGA GTGGGTGTGC TGAAAAGAAG CTCGGGATGC 

851 CGAGGAACGT TGGAAGAAAT TTAGGAAAGC AGTCTTTTGG GTAG 



The PSORT algorithm predicts inner membrane (0.3590). 

The protein was expressed in E.coli and purified as a his-tag product (Figure 150A). The 
recombinant protein was used to immunise mice, whose sera were used in a Western blot (Figure 
150B) and for FACS analysis. 

These experiments show that cp6266 is a surface-exposed and immunoaccessible protein and that 
it is a useful immunogen. These properties are not evident from the sequence alone. 

Example 151 

The following C.pneumoniae protein (pid 4376895) was expressed <SEQ ID 301; cp6895>: 

1 MKIKKSFQYS LCQAKRFQNM LPNHFDPCLQ PVNLQLKQDR LAYGELIILL 
51 SKYQQKTFSS LLKEETCSLN RAKQHLLYKI LRDFNTMQHL RSLGLNGWGE 
101 IPMSPCL* 

The cp6895 nucleotide sequence <SEQ ID 302> is: 

1 ATGAAGATTA AAAAATCTTT TCAATACAGT TTATGCCAAG CAAAGAGATT 

51 TCAGAACATG CTGCCAAACC ACTTTGATCC ATGTTTGCAG CCAGTGAATT 

101 TACAACTCAA ACAAGACAGA TTGGCATACG GGGAGCTCAT CATATTGCTA 

151 TCTAAATATC AACAAAAGAC CTTTTCCTCT TTGTTGAAGG AAGAAACATG 

201 TTCTCTTAAT CGTGCGAAGC AGCACTTATT GTATAAGATT TTGAGAGATT 

251 TTAATACTAT GCAGCATCTA AGGTCCCTCG GATTAAATGG TTGGGGAGAG 

301 ATCCCTATGA GTCCTTGCCT CTAA 

The PSORT algorithm predicts cytoplasm (0.3264). 

The protein was expressed in E.coli and purified as a his-tag product (Figure 151A). The 
recombinant protein was used to immunise mice, whose sera were used in a Western blot (Figure 
151B) and for FACS analysis. 

These experiments show that cp6895 is a surface-exposed and immunoaccessible protein and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 

Example 152 and 
Example 153 

The following C.pneumoniae protein (pid 4376282) was expressed <SEQ ID 303; cp6282>: 

1 MSLLNLPSSQ DSASEDSTSQ SQIFDPIRNR ELVSTPEEKV RQRLLSFLMH 

51 KLNYPKKLII IEKELKTLFP LLMRKGTLIP KRRPDILIIT PPTYTDAQGN 

101 THNLGDPKPL LLIECKALAV NQNALKQLLS YNYSIGATCI AMAGKHSQVS 

151 ALFNPKTQTL DFYPGLPEYS QLLNYFISLN L* 

The cp6282 nucleotide sequence <SEQ ID 304> is: 

1 ATGTCCTTAT TGAACCTTCC CTCAAGCCAG GATTCTGCAT CTGAGGACTC 

51 CACATCGCAA TCTCAAATCT TCGATCCCAT TAGAAATCGG GAGTTAGTTT 

101 CTACTCCCGA AGAAAAAGTC CGCCAAAGGT TGCTCTCCTT CCTAATGCAT 

151 AAGCTGAACT ACCCTAAGAA ACTCATCATC ATAGAAAAAG AACTCAAAAC 

201 TCTTTTTCCT CTGCTTATGC GTAAAGGAAC CCTAATCCCA AAACGCCGCC 

251 CAGATATTCT CATCATCACT CCCCCCACAT ACACAGACGC ACAGGGAAAC 

301 ACTCACAACC TAGGCGACCC AAAACCCCTG CTACTTATCG AATGTAAGGC 

351 CTTAGCCGTA AACCAAAATG CACTCAAACA ACTCCTTAGC TATAACTACT 

401 CTATCGGAGC CACCTGCATT GCTATGGCAG GGAAACACTC TCAAGTGTCA 

451 GCTCTCTTCA ATCCAAAAAC ACAAACTCTT GATTTTTATC CTGGCCTCCC 

501 AGAGTATTCC CAACTCCTAA ACTACTTTAT TTCTTTAAAC TTATAG 

The PSORT algorithm predicts cytoplasm (0.362). 

The following C.pneumoniae protein (pid 4377373) was also expressed <SEQ ID 305; cp7373>: 

1 MSTTTVKHFI HTASRWEFVL KEIVASNYWH AQWINTLSFL ENSGAKKISA 

51 SEHPTEVKEE VLKHAAEEFR HGHYLKTQIS RISETSLPDY TSKNLLGGLL 

101 TKYYLHLLDL RTCRVLENEY SLSGQTLKTA AYILVTYAIE LRASELYPLY 

151 HDILKEAQSK ITVKSIILEE QGHLQEMERE LKDLPHGEEL LGYACQFEGE 

201 LCLQFVERLE QMIFDPSSTF TKF* 

The cp7373 nucleotide sequence <SEQ ID 306> is: 

1 ATGTCTACAA CCACAGTAAA ACACTTTATC CACACAGCCT CTCGTTGGGA 

51 GCCCGTTCTC AAAGAGATCG TAGCTTCCAA CTATTGGCAT GCACAATGGA 

101 TAAATACCCT GTCCTTTTTA GAAAATAGTG GAGCAAAAAA AATCTCCGCA 

151 AGTGAACATC CTACGGAGGT AAAGGAAGAA GTTTTAAAAC ATGCTGCTGA 

201 AGAATTTCGT CATGGTCACT ATCTAAAAAC TCAGATTTCT AGAATCTCAG 

251 AGACTTCTCT CCCTGACTAT ACATCTAAAA ATCTTCTGGG AGGCTTACTT 

301 ACAAAATATT ACCTCCATCT TCTAGATTTA AGGACGTGCC GAGTACTGGA 

351 AAATGAATAC TCCCTATCGG GACAAACGTT AAAAACTGCA GCGTATATTT 

401 TAGTTACCTA CGCAATCGAA CTTCGTGCTT CTGAACTTTA TCCTCTGTAT 

451 CACGATATTC TGAAAGAAGC TCAAAGTAAA ATAACGGTAA AATCCATTAT 

501 CTTAGAAGAG CAAGGCCATC TGCAAGAGAT GGAACGTGAA CTTAAAGATC 

551 TCCCCCACGG GGAGGAACTC TTAGGCTATG CTTGCCAATT CGAAGGGGAG 

601 CTTTGCTTGC AGTTTGTAGA GAGATTAGAA CAAATGATCT TCGATCCTTC 

651 CTCGACTTTT ACAAAGTTCT AG 

The PSORT algorithm predicts cytoplasm (0.1069). 

The proteins were expressed in E.coli and purified as his-tag products (Figure 152A; 6282 = lanes 8 
& 9; 7373 = lanes 2-4). The recombinant proteins were used to immunise mice, whose sera were 
used in Western blots (Figures 152B & 153) and for FACS analysis. 

These experiments show that cp6282 & cp7373 are surface-exposed and immunoaccessible proteins 
and that they are useful immunogens. These properties are not evident from the sequence alone. 

Example 154, 
Example 155, 
Example 156, 
Example 157 and 
Example 158 

The following C.pneumoniae protein (pid 4376412) was expressed <SEQ ID 307; cp6412>: 

1 MSSSEVVFQT VHGLGFGGLS SKSVVPFKKS LSDAPRVVCS ILVLTLGLGA 

51 LVCGIAITCW CVPGVILMGG ICAIVLGAIS LALSLFWLWG LFSNCCGSKR 

101 VLPGEGLLRD KLLDGGFSRA APSGMGLPGD GSPRASTPSC LEELQAEIQA 

151 VTQAIDQMSD D* 

The cp6412 nucleotide sequence <SEQ ID 308> is: 

1 ATGAGCAGTT CGGAAGTTGT TTTCCAGACA GTTCATGGCC TTGGCTTTGG 

51 TGGATTGTCT TCAAAAAGTG TTGTCCCTTT TAAGAAAAGT CTTTCGGATG 

101 CGCCCCGTGT TGTGTGCTCG ATTTTAGTTT TGACTCTGGG GTTGGGAGCG 

151 CTTGTTTGTG GTATTGCCAT TACTTGTTGG TGTGTCCCGG GAGTTATTTT 

201 AATGGGGGGA ATTTGCGCTA TAGTTTTAGG TGCAATTTCT TTAGCTTTAA 

251 GTCTATTTTG GTTGTGGGGT TTATTTTCTA ATTGTTGTGG TTCTAAGAGA 

301 GTTTTACCGG GTGAGGGATT GCTACGGGAT AAGCTTTTAG ATGGTGGATT 

351 TTCAAGAGCG GCACCTTCAG GAATGGGACT TCCGGGTGAT GGATCTCCAA 

401 GAGCGTCAAC GCCATCTTGC CTAGAGGAAC TTCAAGCAGA GATACAGGCA 

451 GTTACTCAAG CTATCGATCA GATGTCAGAT GATTGA 

The PSORT algorithm predicts inner membrane (0.4864). 

The following C.pneumoniae protein (pid 4376431) was also expressed <SEQ ID 309; cp6431>: 

1 LRAGGSLVTT YPKEGQRLRS PEQLRVLDDL VQSYPNHLHA IELDCGAIPQ 
51 DLIGATYIIT FADFSTYILS LRSYQANSPS DDTWGIWFGS IDDPVQAVIS 
101 FLKDHGFALP STLAQDPLLC TNK* 

The cp6431 nucleotide sequence <SEQ ID 310> is: 

1 TTGCGAGCAG GAGGTAGTCT TGTTACAACA TACCCTAAGG AAGGTCAGAG 

51 ATTGCGCTCC CCAGAACAGT TAAGAGTTCT GGATGATTTA GTGCAAAGCT 

101 ATCCAAATCA CCTACATGCG ATTGAACTTG ATTGTGGTGC AATCCCTCAA 

151 GATTTGATCG GAGCCACCTA TATCATCACG TTCGCCGATT TTTCCACCTA 

201 TATTCTCTCT TTAAGAAGCT ACCAAGCCAA TTCTCCCTCC GATGATACAT 

251 GGGGGATTTG GTTTGGATCT ATTGACGATC CTGTTCAAGC AGTCATATCA 

301 TTTTTAAAAG ATCATGGATT TGCTCTTCCC TCGACCTTAG CTCAAGATCC 

351 TTTGCTTTGT ACTAACAAGT AA 

The PSORT algorithm predicts cytoplasm (0.2115). 

The following C.pneumoniae protein (pid 4376443) was also expressed <SEQ ID 311; cp6443>: 

1 MIMTTISNSP SPALNPELSL IPPPTLVSSG TQTSLAYTIP AQGRRSTLRI 

51 ILDIFIIILG LATIISTFIV IFFLNGLNLL STPSIISSSC LIIVGLLFLI 

101 MGLYFMISSL DQGLVGLLQK ELSQAEEREE EYIQEIEALR GAPRAESPTE 

151 SPSTWL* 

The cp6443 nucleotide sequence <SEQ ID 312> is: 

1 ATGATTATGA CTACTATATC TAACTCACCC TCCCCTGCAT TGAATCCCGA 

51 ACTTTCCCTT ATTCCTCCAC CAACACTTGT ATCTTCAGGT ACGCAAACAT 

101 CTCTAGCTTA TACGATCCCC GCACAAGGAC GAAGATCCAC CCTACGTATT 

151 ATATTAGATA TATTCATTAT CATTCTTGGT TTAGCTACGA TCATTTCTAC 

201 CTTTATTGTT ATTTTCTTTT TAAATGGGCT GAACTTGCTC TCGACCCCAT 

251 CTATTATCTC TTCGTCATGT TTAATCATTG TTGGATTGCT TTTTTTGATT 

301 ATGGGGTTAT ATTTCATGAT CTCGAGTTTG GATCAGGGGC TTGTAGGCCT 

351 TCTGCAAAAG GAACTCTCTC AAGCCGAAGA AAGAGAAGAA GAGTATATCC 

401 AGGAAATCGA AGCTTTAAGA GGAGCTCCTA GAGCAGAATC TCCCACAGAG 

451 TCTCCTAGTA CCTGGTTATG A 

The PSORT algorithm predicts inner membrane (0.5585). 

The following C.pneumoniae protein (pid 4376496) was also expressed <SEQ ID 313; cp6496>: 

1 MLIGRYSSDD QFTEATKNTP TIIKLGFVRD NLEGLTNPIS EIVSETSSSI 
51 KDSVLRSLPI LGSILGCARL YSTLSTNDPL DETQEKIWHT IFGALETLGL 
101 GILILLFKII FVILHCIFHL VIGFCK* 

The cp6496 nucleotide sequence <SEQ ID 314> is: 

1 ATGCTAATAG GCAGATACAG TAGTGATGAC CAATTCACTG AAGCAACAAA 

51 AAACACCCCA ACCATAATTA AGCTAGGTTT TGTTAGAGAT AATCTCGAGG 

101 GATTAACGAA CCCTATCTCT GAAATCGTCT CGGAAACCTC CTCTTCTATT 

151 AAAGATTCCG TTCTTCGCTC TCTTCCTATT TTAGGGTCCA TTTTAGGATG 

201 CGCCCGACTT TACAGCACAC TCTCTACAAA TGATCCTCTT GACGAAACTC 

251 AAGAAAAGAT TTGGCACACT ATATTTGGAG CCTTAGAAAC CTTAGGCTTA 

301 GGGATTCTCA TCCTCTTATT TAAAATTATT TTTGTTATAT TACACTGCAT 

351 ATTTCATCTA GTTATTGGGT TCTGCAAATA A 

The PSORT algorithm predicts inner membrane (0.5989). 

The following C.pneumoniae protein (pid 4376654) was also expressed <SEQ ID 315; cp6654>: 

1 MKTKMNSRKK AGQWAIFNSP TPGVSSTLVL AWTPWGYYDK DVQDILERKD 
51 PMSSSLSEKD SKEFLKNLFV DLLENGFTSV HIHAEEAFTP LDHTGKPHFK 
101 RDNVYLPGKL LGALNEAAVQ ANVSADTQFT LFLTQDECNP FHDKKRG* 

The cp6654 nucleotide sequence <SEQ ID 316> is: 

1 ATGAAAACTA AAATGAACTC TAGAAAAAAA GCAGGTCAAT GGGCAATTTT 

51 CAATTCTCCA ACTCCTGGTG TCAGTTCAAC TTTAGTTTTA GCATGGACTC 

101 CTTGGGGTTA TTACGACAAG GATGTACAAG ATATCTTAGA AAGAAAAGAT 

151 CCGATGAGCT CTTCGCTTTC TGAAAAAGAC TCAAAGGAGT TCTTGAAAAA 

201 TCTGTTTGTA GATCTCTTAG AAAATGGCTT CACATCAGTA CATATTCACG 

251 CAGAAGAAGC TTTCACTCCT CTTGATCATA CCGGGAAACC TCACTTTAAA 

301 AGAGACAATG TGTACTTACC CGGAAAGTTG TTAGGCGCCT TGAATGAGGC 

351 TGCGGTACAA GCCAATGTAA GTGCGGATAC TCAATTTACA TTGTTCCTTA 

401 CTCAAGATGA GTGCAATCCT TTTCATGATA AGAAAAGAGG TTAA 

The PSORT algorithm predicts cytoplasm (0.0730). 

The proteins were expressed in E.coli and purified as his-tag products (Figure 154A; 6412 = lanes 
2-3; 6431 = lanes 11-12; 6443 = lanes 5-6; 6496 = lanes 8-9; 6654 = lane 10; markers in lanes 1, 4, 
7). The recombinant proteins were used to immunise mice, whose sera were used in Western blots 
(Figures 154B, 155, 156, 157 & 158) and for FACS analysis. 

These experiments show that cp6412, cp6431, cp6443, cp6496 & cp6654 are surface-exposed and 
immunoaccessible proteins and that they are useful immunogens. These properties are not evident 
from their sequences alone. 

Example 159 and 
Example 160 

The following C.pneumoniae protein (pid 4376477) was expressed <SEQ ID 317; cp6477>: 

1 LLKFFLVCEE LCILTVATHR ALLETPLALS FFKELKTKYV YRAKDILQLH 
51 NYKGFTILNT SPLCS* 

The cp6477 nucleotide sequence <SEQ ID 318> is: 

1 TTGCTAAAGT TCTTTCTAGT ATGTGAAGAG TTATGTATAC TTACTGTTGC 

51 TACACATAGA GCTCTCTTAG AAACTCCTTT AGCTCTATCA TTTTTTAAAG 

101 AACTTAAGAC AAAATATGTC TACAGGGCGA AAGACATACT ACAACTACAT 

151 AACTATAAAG GATTTACTAT CCTTAATACA TCACCGTTAT GTTCTTAA 

The PSORT algorithm predicts inner membrane (0.128). 

The following C.pneumoniae protein (pid 4376435) was also expressed <SEQ ID 319; cp6435>: 

1 LWSHFPRGFF MLPFCPTILL AKPFLNSENY GLERLAATVD SYFDLGQSQI 
51 VFLSKQDQGI TVEELSAKDR KFKPGSMNCT LYTEDPILPA HNSFSNCSDI 
101 QMRTPISPIH * 

The cp6435 nucleotide sequence <SEQ ID 320> is: 

1 TTGTGGTCGC ATTTCCCAAG AGGATTTTTT ATGCTCCCTT TTTGCCCTAC 

51 CATCCTTCTT GCTAAACCTT TTTTAAATAG CGAGAATTAC GGCTTAGAAC 

101 GTTTAGCTGC AACCGTAGAT TCTTATTTTG ATCTGGGACA GTCTCAAATA 

151 GTCTTCCTAA GCAAACAGGA TCAAGGAATC ACTGTGGAAG AATTGAGTGC 

201 TAAAGATAGG AAATTCAAGC CAGGCTCTAT GAACTGTACA CTGTACACTG 

251 AAGATCCTAT CTTACCTGCT CATAATTCCT TTAGTAATTG CTCTGATATT 

301 CAAATGCGTA CTCCGATTAG CCCTATACAT TAA 

The PSORT algorithm predicts periplasmic space (0.4044). 

The proteins were expressed in E.coli and purified as his-tag products (Figure 159A; 6435 = lanes 
2-4; 6477 = lanes 5-7). The recombinant proteins were used to immunise mice, whose sera were used 
in Western blots (Figures 159B & 160) and for FACS analysis. 

These experiments show that cp6477 & cp6435 are surface-exposed and immunoaccessible proteins 
and that they are useful immunogens. These properties are not evident from the sequences alone. 

Example 161, 
Example 162 and 
Example 163 

The following C.pneumoniae protein (pid 4376441) was expressed <SEQ ID 321; cp6441>: 

1 VEAGANVLVI DTAHAHSKGV FQTVLEIKSQ FPQISLVVGN LVTAEAAVSL 

51 AEIGVDAVKV GIGPGSICTT RIVSGVGYPQ ITAITNVAKA LKNSAVTVIA 

101 DGRIRYSGDV VKALAAGADC VMLGSLLAGT DEAPGDIVSI DEKLFKRYRG 

151 MGSLGAMKQG SADRYFQTQG QKKLVPGGVE GLVAYKGSVH DVLYQILGGI 

201 RSGMGYVGAE TLKDLKTKAS FVRITESGRA ESHIHNIYKV QPTLNY 

The cp6441 nucleotide sequence <SEQ ID 322> is: 

1 GTGGAAGCTG GAGCAAATGT TCTAGTCATT GACACAGCTC ATGCACACTC 

51 TAAAGGAGTA TTCCAAACAG TTTTAGAAAT AAAATCCCAG TTCCCACAAA 

101 TTTCTTTAGT TGTAGGGAAT CTTGTTACAG CTGAAGCCGC AGTTTCCTTA 

151 GCTGAGATTG GAGTTGACGC TGTAAAGGTA GGTATTGGCC CAGGATCTAT 

201 CTGTACAACT AGAATCGTTT CAGGGGTCGG TTATCCACAA ATTACTGCCA 

251 TTACAAACGT AGCAAAAGCT CTTAAAAACT CTGCCGTGAC TGTAATTGCT 

301 GATGGGAGAA TCCGCTATTC TGGAGATGTG GTAAAAGCAT TAGCAGCAGG 

351 AGCAGACTGT GTCATGCTAG GAAGTTTGCT TGCAGGGACT GATGAAGCTC 

401 CTGGGGATAT CGTTTCTATC GATGAGAAGC TTTTTAAAAG GTACCGCGGC 

451 ATGGGATCTT TAGGCGCTAT GAAACAAGGA AGTGCTGACC GGTATTTTCA 

501 AACACAGGGA CAGAAAAAGC TGGTTCCTGG GGGAGTTGAA GGACTAGTCG 

551 CTTATAAAGG CTCTGTCCAC GATGTCCTCT ATCAAATTTT AGGAGGAATA 

601 CGCTCAGGTA TGGGGTATGT TGGAGCTGAA ACTCTCAAAG ATTTAAAAAC 

651 TAAGGCTTCC TTTGTTCGAA TTACTGAATC TGGAAGAGCT GAAAGTCATA 

701 TTCATAATAT TTACAAAGTT CAACCAACCT TAAATTATTA A 

The PSORT algorithm predicts bacterial inner membrane (0.132). 

The following C.pneumoniae protein (pid 4376748) was also expressed <SEQ ID 323; cp6748>: 

1 LFSEGTALNL FRIFAPLRNR VTTEYSRARQ PDLHRIAIVY IGVLDSESSK 

51 ILERLISYMS CIYSESQMYL RFFMGKNVNQ SAVLSKLHVE NLHIRCGFFS 

101 EDAVPESEPF DLSIYVHTDR SCPLPTKKRS SSWELQTVEL PESIYPQSEF 

151 LLMRPRMLS* 

The cp6748 nucleotide sequence <SEQ ID 324> is: 

1 TTGTTCTCTG AGGGGACAGC TCTAAATTTA TTTCGTATAT TTGCTCCACT 

51 ACGCAACCGT GTGACTACAG AATACAGTCG TGCTAGGCAA CCCGACCTAC 

101 ATAGAATTGC CATCGTCTAT ATAGGAGTTC TCGATTCAGA AAGTTCCAAG 

151 ATCCTAGAGC GGCTAATCTC TTATATGAGT TGTATCTATT CTGAATCGCA 

201 AATGTATTTA AGATTCTTTA TGGGCAAGAA TGTAAATCAA AGTGCTGTAC 

251 TCTCAAAATT ACATGTAGAA AATCTGCACA TCCGTTGTGG GTTTTTCAGC 

301 GAGGATGCTG TTCCAGAGAG TGAGCCCTTC GATCTCTCCA TCTACGTGCA 

351 CACAGATCGT AGCTGTCCTC TCCCTACGAA AAAACGGAGC AGCTCCTGGG 

401 AACTCCAAAC TGTAGAACTC CCAGAGTCAA TATATCCACA GTCGGAATTC 

451 CTATTGATGA GACCTCGAAT GCTTTCGTAG 

The PSORT algorithm predicts cytoplasm (0.170). 

The following C.pneumoniae protein (pid 4376881) was also expressed <SEQ ID 325; cp6881>: 

1 MRPHRKHVSS KSLALKQSAS THVEITTKAF RLSMPLKQLI LEKSDHLPPM 

51 ETIRVVLTSH KDKLGTEVHV VASHGKEILQ TKVHNANPYT AVINAFKKIR 

101 TMANKHSNKR KDRTKHDLGL AAKEERIAIQ EEQEDRLSNE WLPVEGLDAW 

151 DSLKTLGYVP ASAKKKISKK KMSIRMLSQD EAIRQLESAA ENFLIFLNEQ 
201 EHKIQCIYKK HDGNYVLIEP SLKPGFCI* 

The cp6881 nucleotide sequence <SEQ ID 326> is: 

1 ATGAGACCTC ATCGTAAACA CGTATCATCT AAAAGCTTAG CTTTAAAGCA 

51 ATCTGCATCA ACTCATGTAG AGATCACAAC AAAAGCCTTT CGTCTCTCTA 

101 TGCCTCTAAA ACAGCTGATC CTAGAGAAAA GCGACCACCT CCCCCCTATG 

151 GAAACAATCC GTGTGGTGCT AACCTCTCAT AAAGATAAGC TAGGCACCGA 

201 GGTGCATGTT GTAGCTTCTC ATGGCAAAGA AATCCTTCAA ACTAAGGTTC 

251 ATAACGCAAA CCCATACACT GCAGTGATCA ATGCTTTTAA GAAAATCCGC 

301 ACCATGGCAA ATAAGCACTC CAATAAACGT AAAGACAGGA CAAAACATGA 

351 TCTAGGTCTT GCAGCAAAAG AAGAACGTAT CGCAATACAG GAAGAACAAG 

401 AAGATCGCCT TAGCAACGAG TGGCTTCCTG TCGAAGGCCT CGATGCCTGG 

451 GATTCTCTAA AAACTCTTGG GTATGTTCCC GCATCAGCGA AAAAGAAGAT 

501 CTCCAAGAAA AAGATGAGCA TTCGTATGCT ATCTCAAGAC GAGGCTATCC 

551 GCCAGCTAGA GTCTGCCGCA GAAAACTTCC TGATCTTCTT GAACGAGCAA 

601 GAGCATAAAA TCCAATGCAT TTATAAAAAA CATGACGGCA ACTATGTCCT 

651 TATTGAACCT TCCCTCAAGC CAGGATTCTG CATCTGA 

The PSORT algorithm predicts cytoplasm (0.249). 

The proteins were expressed in E.coli and purified as his-tag products (Figure 161A; 6441 = lanes 
7-9; 6748 = lanes 2-3; 6881 = lanes 4-6). The recombinant proteins were used to immunise mice, 
whose sera were used in Western blots (Figures 161B, 162 & 163) and for FACS analysis. 

These experiments show that cp6441, cp6748 & cp6881 are surface-exposed and immunoaccessible 
proteins and that they are useful immunogens. These properties are not evident from the sequence 
alone. 



Example 164, 
Example 165 and 
Example 166 



The following C.pneumoniae protein (pid 4376444) was expressed <SEQ ID 327; cp6444>: 

1 MEQPNCVIQD TTTVLYALNS FDPRLSDDTH RLGKQSPLEA ENALGEFIEG 
51 LDTNSFPLEE VAIPILPGYH PKFYLSFIDR DDQGVHYEVL DGVFLKTVAA 
101 CIIENSFLTD SMSPELLSEV KEALKR* 

The cp6444 nucleotide sequence <SEQ ID 328> is: 

1 ATGGAGCAAC CCAATTGTGT GATTCAGGAT ACTACAACTG TTTTGTATGC 

51 CTTAAATAGC TTTGATCCTA GACTTAGTGA TGACACTCAC AGACTTGGGA 

101 AGCAATCACC TCTTGAAGCA GAAAATGCTC TTGGAGAATT TATTGAAGGT 

151 TTGGATACAA ATAGCTTTCC TTTAGAGGAA GTTGCCATTC CCATCCTGCC 

201 AGGTTATCAC CCTAAGTTTT ATTTATCTTT CATAGATAGG GACGATCAAG 

251 GTGTCCACTA TGAAGTTTTA GATGGCGTAT TTTTAAAGAC AGTCGCTGCT 

301 TGTATTATAG AGAACTCCTT CTTAACTGAT TCTATGAGCC CGGAGCTTCT 

351 CAGCGAAGTT AAGGAAGCTC TGAAACGATG A 

The PSORT algorithm predicts cytoplasm (0.2031). 



The following C.pneumoniae protein (pid 4376413) was also expressed <SEQ ID 329; cp6413>: 

1 MAVQSIKEAV TSAATSVGCV NCSREAIPAF NTEERATSIA RSVIAAIIAV 
51 VAISLLGLGL VVLAGCCPLG MAAGAITMLL GVALLAWAIL ITLRLLNIPK 
101 AEIPSPGNNG EPNERNSATP PLEGGVAGEA GRGGGSPLTQ LDLNSGAGS* 

The cp6413 nucleotide sequence <SEQ ID 330> is: 

1 ATGGCTGTTC AATCTATAAA AGAAGCCGTA ACATCAGCCG CAACATCAGT 

51 AGGATGTGTA AACTGTTCTA GAGAGGCTAT ACCAGCATTT AATACAGAGG 

101 AGAGAGCAAC GAGTATTGCT AGATCTGTTA TAGCAGCTAT CATTGCTGTT 

151 GTAGCTATCT CCTTACTCGG ACTAGGTCTT GTAGTTCTTG CTGGTTGCTG 

201 TCCTTTAGGA ATGGCTGCGG GTGCTATAAC AATGCTGCTG GGTGTAGCAT 

251 TATTAGCTTG GGCAATACTG ATTACTTTGA GACTGCTTAA TATACCTAAG 

301 GCTGAAATAC CGAGTCCAGG GAACAACGGT GAGCCTAATG AAAGAAATTC 

351 AGCAACTCCT CCTCTAGAGG GTGGTGTTGC AGGAGAAGCC GGTCGCGGCG 

401 GGGGGTCACC TTTAACCCAA CTTGATCTCA ATTCAGGGGC GGGAAGTTAG 

The PSORT algorithm predicts inner membrane (0.6180). 



The following C.pneumoniae protein (pid 4377391) was also expressed <SEQ ID 331; cp7391>: 

1 MMLRVIELPL LPIKQALEKA FVQYNSYKAK LTKVEPCFRE SPAYITSEER 

51 LQSLDQTLER AYKEYQKRFQ EPSRLESEVS GCREHLREQV KQFETQGLDL 

101 IKEELIFVSD VLFRKMVSCL VSTVHVPFME FYYEYFELHR LRLRAQWMAN 

151 AEIYSKVRKA FPEMLKETLE KAKAPREEEY WLLCEERKSK EKRLILNKIE 

201 AAQQRVKDLE PPPIKETGKQ KLRKKEYSFFI RLKS* 

The cp7391 nucleotide sequence <SEQ ID 332> is: 

1 ATGATGCTTC GTGTCATAGA GCTTCCACTA CTTCCTATAA AGCAAGCGTT 
51 GGAGAAGGCT TTTGTACAAT ATAATAGCTA CAAAGCGAAG TTAACCAAGG 
101 TAGAACCTTG CTTTAGAGAG AGCCCTGCCT ATATAACTAG CGAAGAGCGA 
151 CTCCAGAGTT TGGATCAGAC TTTAGAACGT GCGTACAAAG AGTACCAGAA 
201 GAGATTCCAG GAGCCTTCAC GTTTGGAATC GGAAGTAAGT GGATGTAGAG 
251 AGCATCTTAG AGAGCAGGTA AAACAATTTG AAACTCAAGG ACTAGACTTG 
301 ATCAAAGAAG AGCTTATTTT TGTTAGTGAT GTGTTATTCC GAAAAATGGT 
351 CAGTTGTCTA GTGTCGACAG TGCATGTTCC CTTTATGGAG TTTTATTATG 
401 AGTATTTTGA GTTGCATAGA TTGAGGTTGC GGGCCCAATG GATGGCGAAT 
451 GCCGAGATTT ATAGCAAAGT TAGAAAAGCA TTCCCAGAGA TGTTGAAGGA 
501 GACCTTAGAA AAAGCTAAGG CTCCCAGAGA AGAAGAGTAT TGGTTACTTT 
551 GCGAGGAGAG AAAGAGTAAG GAGAAGCGTT TGATTCTCAA CAAGATAGAG 
601 GCAGCTCAGC AGCGGGTAAA AGATTTAGAA CCTCCTCCTA TTAAAGAGAC 
651 AGGGAAACAG AAACGGAAGA AAGAATATTC GTTTTTCATT CGATTAAAAT 
701 CGTGA 

The PSORT algorithm predicts inner membrane (0.1489). 

The proteins were expressed in E.coli and purified as his-tag and GST-fusion products (Figure 164A; 
6444 = lanes 11-12; 7391 = lanes 2-3; 6413 = lanes 4-6). The recombinant proteins were used to immunise 
mice, whose sera were used in Western blots (Figures 164B, 165 & 166) and for FACS analysis. 

These experiments show that cp6444, cp6413 & cp7391 are surface-exposed and immunoaccessible 
proteins and that they are useful immunogens. These properties are not evident from the sequence 
alone. 



Example 167, 
Example 168, 
Example 169 and 
Example 170 

The following C.pneumoniae protein (pid 4376463) was expressed <SEQ ID 333; cp6463>: 

1 MKKKVTIDEA LKEILRLEGA ATQEELCAKL LAQGFATTQS SVSRWLRKIQ 
51 AVKVAGERGA RYSLPSSTEK TTTRHLVLSI RHNASLIVIR TVPGSASWIA 
101 ALLDQGLKDE ILGTIAGDDT IFVTPIDEGR LPLLMVSIAN LLQVFLD* 

The cp6463 nucleotide sequence <SEQ ID 334> is: 

1 ATGAAAAAAA AAGTAACTAT AGATGAGGCT TTAAAAGAAA TTTOACGTCT 

51 TGAAGGAGCG GCAACTCAGG AGGAATTATG TGCAAAACTC TTAGCTCAAG 

101 GTTTTGCTAC AACCCAGTCG TCTGTATCTC GTTGGCTACG AAAGATTCAG 

151 GCTGTAAAGG TTGCTGGAGA GCGTGGTGCT CGTTATTCTT TACCCTCTTC 

201 AACAGAGAAG ACCACGACCC GTCATTTGGT GCTCTCTATT CGCCATAACG 
251 CCTCTCTTAT TGTAATTCGT ACGGTTCCTG GTTCAGCTTC TTGGATCGCT 
301 GCTTTGTTAG ATCAAGGGCT CAAAGATGAA ATTCTTGGAA CTTTGGCAGG 
351 AGATGACACG ATTTTTGTCA CTCCTATAGA TGAAGGGAGG CTCCCATTGT 
401 TGATGGTTTC GATTGCAAAT TTACTGCAAG TTTTCTTGGA TTAA 

The PSORT algorithm predicts inner membrane (0.1510). 

The following C.pneumoniae protein (pid 4376540) was also expressed <SEQ ID 335; cp6540>: 

1 MSQCQSSSTS TWEWMKSFVP NWKNPTPPLS PIPSEDEFIL AYEPFVLPKT 
51 DPENAQANPP GTSTPNVENG IDDLNPLLGQ PNEQNNANNP GTSGSNPTSL 
101 PAPERLPETE ENSQEEEQGS QNNEDLIG* 

The cp6540 nucleotide sequence <SEQ ID 336> is: 

1 ATGTCTCAAT GTCAGAGTAG CAGTACATCT ACCTGGGAAT GGATGAAATC 

51 TTTTGTGCCA AACTGGAAGA ATCCAACTCC CCCCTTATCT CCTATACCTT 

101 CTGAGGACGA ATTTATATTA GCATACGAGC CATTTGTTCT ACCGAAAACA 

151 GATCCAGAAA ACGCACAAGC TAATCCTCCA GGCACATCTA CACCGAATGT 

201 AGAAAACGGG ATCGATGATC TCAACCCTCT TCTGGGGCAA CCCAACGAAC 

251 AAAACAATGC CAACAATCCA GGAACTTCTG GATCTAATCC TACATCTCTA 

301 CCCGCCCCCG AACGACTCCC TGAAACTGAA GAGAACAGCC AAGAAGAAGA 

351 ACAAGGATCT CAAAATAATG AGGATCTTAT AGGATAA 

The PSORT algorithm predicts cytoplasm (0.3086). 

The following C.pneumoniae protein (pid 4376743) was also expressed <SEQ ID 337; cp6743>: 

1 LREEGSVSFR EYFRAYMCDK IVAQKNFLFT LDAVIKQAGW RSQEKLNLFY 
51 VESQALGREI KVSLEEYIQS MVGILGSQRT KKSFKFSVDF TPLEQALQER 
101 CSSDDDEDAT ATSTATGATA SPTDMHEDE* 

The cp6743 nucleotide sequence <SEQ ID 338> is: 

1 TTGAGAGAAG AAGGTAGTGT TTCTTTCAGA GAATATTTCA GAGCCTATAT 

51 GTGTGATAAA ATCGTGGCAC AGAAGAACTT CTTATTTACT TTAGACGCTG 

101 TAATTAAACA GGCCGGTTGG AGATCACAAG AGAAACTCAA TTTATTTTAT 

151 GTTGAAAGTC AGGCTTTAGG AAGAGAAATC AAAGTCAGCT TAGAGGAATA 

201 TATTCAGAGT ATGGTCGGGA TTTTGGGATC TCAGAGAACC AAGAAAAGCT 

251 TTAAGTTTTC TGTCGACTTT ACCCCTTTAG AGCAGGCTCT ACAAGAAAGA 

301 TGCTCTTCTG ATGATGACGA AGATGCAACA GCAACTTCGA CCGCTACAGG 

351 GGCAACAGCA TCTCCGACTG ACATGCACGA AGATGAGTAA 

The PSORT algorithm predicts cytoplasm (0.2769). 

The following C.pneumoniae protein (pid 4377041) was also expressed <SEQ ID 339; cp7041>: 

1 MLMMLMMIIG ITGGSGAGKT TLTQNIKEIF GEDVSVICQD NYYKDRSHYT 
51 PEERANLIWD HPDAFDNDLL ISDIKRLKNN EIVQAPVFDF VLGNRSKTEI 
101 ETIYPSKVIL VEGILVFENQ ELRDLMDIRI FVDTDADERI LRRMVRDVQE 
151 QGDSVDCIMS RYLSMVKPMH EKFIEPTRKY ADIIVHGNYR QNVVTNILSQ 
201 KIKNHLENAL ESDETYYMVN SK* 

The cp7041 nucleotide sequence <SEQ ID 340> is: 

1 ATGTTGATGA TGCTTATGAT GATTATTGGA ATTACAGGAG GTTCTGGAGC 

51 TGGGAAAACC ACCCTAACCC AAAACATTAA AGAAATTTTC GGTGAGGATG 

101 TGAGTGTTAT CTGCCAAGAT AATTATTACA AAGATAGATC TCATTATACT 

151 CCTGAAGAAC GTGCCAATTT AATTTGGGAT CATCCGGACG CCTTTGATAA 

201 TGACTTATTA ATTTCAGACA TAAAACGTCT AAAAAATAAT GAGATTGTCC 

251 AAGCCCCAGT TTTTGATTTT GTTTTAGGTA ATCGATCTAA AACGGAGATA 

301 GAAACGATCT ATCCATCTAA AGTTATTCTT GTTGAAGGTA TTCTGGTCTT 

351 TGAAAATCAA GAACTTAGAG ATCTTATGGA TATTAGGATC TTTGTAGACA 

401 CCGATGCTGA TGAAAGGATA CTACGCCGTA TGGTTCGAGA TGTTCAAGAA 

451 CAAGGAGATA GCGTGGACTG CATCATGTCT CGTTATCTTT CTATGGTAAA 

501 GCCTATGCAT GAGAAATTTA TAGAGCCGAC TCGGAAATAT GCTGATATCA 

551 TTGTACATGG AAATTACCGA CAAAACGTAG TAACAAATAT TTTGTCACAG 

601 AAAATTAAAA ATCATTTAGA GAATGCCCTG GAAAGCGATG AGACGTATTA 

651 TATGGTCAAC TCTAAGTAA 

The PSORT algorithm predicts inner membrane (0.1022). 

The proteins were expressed in E.coli and purified as his-tag products (Figure 167A; 6463 = lanes 
2-4; 6540 = lanes 5-7; 6743 = lanes 8-9; 7041 = lanes 10-11). The recombinant proteins were used to 
immunise mice, whose sera were used in Western blots (Figures 167B, 168, 169 & 170) and for 
FACS analysis. 
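
When reading the gel lanes it can help to know roughly where each band is expected to run. The sketch below is a rough Python estimate of the average molecular mass of cp6540 from its listing above. It assumes standard average residue masses and ignores the his-tag and any linker, whose sequences are not given here. `RESIDUE_MASS`, `WATER_MASS` and `average_mass` are illustrative names, not part of the original disclosure.

```python
# Average residue masses in daltons (amino acid minus one water).
RESIDUE_MASS = {
    "G": 57.05, "A": 71.08, "S": 87.08, "P": 97.12, "V": 99.13,
    "T": 101.10, "C": 103.14, "L": 113.16, "I": 113.16, "N": 114.10,
    "D": 115.09, "Q": 128.13, "K": 128.17, "E": 129.12, "M": 131.19,
    "H": 137.14, "F": 147.18, "R": 156.19, "Y": 163.18, "W": 186.21,
}
WATER_MASS = 18.02  # one water is added back for the free termini


def average_mass(protein: str) -> float:
    """Approximate average mass in Da of an untagged protein listing."""
    seq = "".join(protein.split()).rstrip("*")  # drop spacing and the stop marker
    return sum(RESIDUE_MASS[residue] for residue in seq) + WATER_MASS


# cp6540 protein listing <SEQ ID 335>, position labels removed.
CP6540_AA = """
MSQCQSSSTS TWEWMKSFVP NWKNPTPPLS PIPSEDEFIL AYEPFVLPKT
DPENAQANPP GTSTPNVENG IDDLNPLLGQ PNEQNNANNP GTSGSNPTSL
PAPERLPETE ENSQEEEQGS QNNEDLIG*
"""

if __name__ == "__main__":
    print(f"cp6540 (untagged): about {average_mass(CP6540_AA) / 1000:.1f} kDa")
```

The tagged product will run somewhat higher than this figure, and acidic proteins such as cp6540 often migrate anomalously on SDS-PAGE, so the estimate is only a guide.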

These experiments show that cp6463, cp6540, cp6743 & cp7041 are surface-exposed and 
immunoaccessible proteins and that they are useful immunogens. These properties are not evident 
from the sequence alone. 

Example 171, Example 172 and Example 173 

The following C.pneumoniae protein (pid 4376632) was expressed <SEQ ID 341; cp6632>: 

1 VQLFQYMNES GWDWLCDFDS QGEGFQLSRL VGLLHSSWAL YEAKEQFYLP 
51 EVSLLTWEEL IEMQLLSKPT KHGVAKDLCN VFEKHFQRFR QYLGSLDLNQ 
101 RFENTFDNYP KYHLDRE* 

The cp6632 nucleotide sequence <SEQ ID 342> is: 

1 GTGCAATTAT TTCAATATAT GAATGAGTCC GGATGGGATT GGCTTTGTGA 
51 TTTTGATTCT CAAGGCGAGG GATTCCAGTT ATCACGTCTG GTTGGGCTGT 
101 TACATTCGTC CTGGGCATTA TACGAAGCAA AAGAGCAATT TTACCTTCCT 
151 GAGGTTTCTC TATTGACCTG GGAAGAACTG ATAGAAATGC AGTTATTAAG 
201 CAAACCAACA AAACACGGGG TTGCAAAAGA TCTTTGTAAT GTATTTGAAA 
251 AACACTTTCA AAGGTTTAGA CAGTACCTAG GTTCCTTAGA TCTAAATCAA 
301 AGGTTCGAAA ATACCTTCTT GAATTATCCT AAATACCATT TAGATAGGGA 
351 GTGA 

The PSORT algorithm predicts cytoplasm (0.3627). 

The following C.pneumoniae protein (pid 4376648) was also expressed <SEQ ID 343; cp6648>: 

1 MPVSSAPLPT SHRPSSGNLG LMEPNSKALK AKHQDKTTKT IKLLVKILVA 
51 ILVIEVLGII AAFFIPGTPP ICLIILGGLI LTTVLCVLLL VIKLALVNKT 
101 EGTTAEQQIK RKLSSKSIS* 

The cp6648 nucleotide sequence <SEQ ID 344> is: 

1 ATGCCCGTGT CCTCAGCCCC CCTACCCACA AGCCACCGCC CTTCCTCTGG 
51 AAATCTAGGC CTCATGGAAC CAAATTCCAA AGCTCTAAAA GCAAAGCATC 
101 AAGATAAAAC GACGAAGACG ATTAAACTTT TAGTTAAAAT CCTTGTTGCC 
151 ATTCTAGTAA TAGAAGTTTT AGGAATAATT GCAGCTTTCT TTATTCCTGG 
201 GACTCCTCCC ATCTGCTTGA TTATCCTAGG AGGCCTTATT CTTACAACAG 
251 TACTCTGTGT GCTTCTTCTT GTTATAAAGC TTGCCCTTGT AAACAAAACC 
301 GAAGGAACAA CTGCTGAACA GCAGATAAAA CGTAAACTCT CTTCTAAAAG 
351 TATTTCTTAG 

The PSORT algorithm predicts inner membrane (0.60?4). 

The following C.pneumoniae protein (pid 4376497) was also expressed <SEQ ID 345; cp6497>: 

1 MKPNSIIFLE NTKHYPDIFR EGFVRDRHGL MEASDWLLST EITIIRSILG 
51 AIPILGNILG AGRLYSVWYT SDEDWKKQVV* 

The cp6497 nucleotide sequence <SEQ ID 346> is: 

1 ATGAAGCCAA ATAGTATTAT TTTTTTAGAA AATACTAAGC ATTATCCCGA 

51 CATCTTTCGA GAAGGATTTG TTCGTGATCG TCATGGACTA ATGGAAGCCT 

101 CGGATTGGTT ACTTTCTACG GAAATTACGA TCATTCGCTC CATTCTGGGA 

151 GCTATCCCTA TTTTAGGAAA TATTCTTGGA GCCGGACGAC TCTATAGCGT 
201 TTGGTATACA AGTGACGAAG ATTGGAAAAA ACAAGTGGTT TGA 

The PSORT algorithm predicts inner membrane (0.145). 

The proteins were expressed in E.coli and purified as his-tag products (Figure 171A; 6632 = lanes 
5-7; 6648 = lanes 8-10; 6497 = lanes 2-4). The recombinant proteins were used to immunise mice, 
whose sera were used in Western blots (Figures 171B, 172, 173) and for FACS analysis. 

These experiments show that cp6632, cp6648 and cp6497 are surface-exposed and 
immunoaccessible proteins and that they are useful immunogens. These properties are not evident 
from the sequence alone. 

Example 174, Example 175, Example 176, Example 177 and Example 178 

The following C.pneumoniae protein (pid 4377200) was expressed <SEQ ID 347; cp7200>: 

1 MPVPIDNSSR NLQEVPESLE DLEQHAEESP THQSAESSSL QLSLASSAIS 
51 SRVEQLSSLV LGMENSDFSS LRDVPIFSAI YESSTHTPVP TPLVGVGYIN 
101 GSQSGYYDTQ RESLHLSQLL GSRRVEVVYN QGNFMEASLL NLCPRRPRRD 
151 PSPISLALLE LWEAFFLEHP PGSTFNPIFF W* 

The cp7200 nucleotide sequence <SEQ ID 348> is: 

1 ATGCCCGTTC CTATAGATAA TTCCTCTCGC AACCTACAAG AAGTTCCAGA 
51 AAGCCTAGAA GACCTCGAAC AACACGCAGA AGAATCTCCT ACTCATCAAA 
101 GTGCAGAAAG CAGTTCTTTG CAACTGTCTC TAGCCTCCTC AGCAATTTCT 
151 AGTAGAGTAG AACAACTATC TTCCCTCGTC TTAGGAATGG AAAATTCAGA 
201 TTTCTCCTCT TTAAGAGACG TTCCTATCTT CTCAGCTATC TACGAATCTT 
251 CAACACACAC ACCTGTCCCC ACTCCTCTAG TTGGCGTGGG ATATATCAAC 
301 GGAAGTCAAT CAGGATACTA CGATACACAA AGAGAATCTC TTCACCTCAG 
351 CCAATTGTTA GGAAGCCGAA GAGTTGAAGT TGTCTATAAC CAAGGAAACT 
401 TCATGGAGGC CTCTTTGCTA AATCTGTGCC CCAGAAGACC TCGAAGAGAT 
451 CCCTCTCCAA TTTCTTTAGC TCTATTAGAG CTCTGGGAAG CATTTTTTTT 
501 AGAACACCCC CCAGGTAGCA CTTTTAATCC AATATTTTTT TGGTAA 

The PSORT algorithm predicts cytoplasm (0.3672). 

The following C.pneumoniae protein (pid 4377235) was also expressed <SEQ ID 349; cp7235>: 

... IVHPIAQQLG ... ARSHTFSDHI ... 

The cp7235 nucleotide sequence <SEQ ID 350> is: 

1 TTGAATTTTG TATCGACTCT GACCGGCTCC GATTTTTATG CTCCTGTTTT 
51 AGAAAAACTA GAAGAAGCTT TTGCAGATAC CACAGGACAG GTGATCCTTT 
101 TTTCTTCTTC TCCAGACTTT ATTGTCCACC CCATAGCGCA GCAACTCGGG 
151 ATTAGTTCTT GGTATGCGTC GTGTTATCGC GATCAGTCTG CAGAACAGAC 
201 GATCTATAAA AAATGTCTTA CAGGGGATAA AAAAGCGCAA ATTTTGAGTT 
251 ATATTAAAAA AATTAATCAA GCAAGAAGCC ATACCTTCTC CGACCATATT 
301 TTAGATCTTC CTTTTCTTAT GCTGGGAGAA GAGAAAACCG TCGTTCGCCC 
351 TCAGGGACGA CTCAAGAAAA TGGCAAAAAA ATATTACTGG AATATCGTTT 
401 AA 

The PSORT algorithm predicts cytoplasm (0.3214). 




The following C.pneumoniae protein (pid 4377268) was also expressed <SEQ ID 351; cp7268>: 

1 MMHRYFIPLL ALLIFSPSLV RAELQPSENR KGGWPTQLSC AEGSQLFCKF 
51 EAAYNNAIEE GKPGILVFFS ERPTPEFADL TNGSFSLSTP IAKGFNVVVL 
101 CPGLISPLDF FHKMDPVILY MGSFLEMFPE VEAVSGPRLC YILIDEQGGA 
151 QCQAVLPLET KN* 

The cp7268 nucleotide sequence <SEQ ID 352> is: 

1 ATGATGCACC GTTATTTTAT TCCTTTATTA GCACTTCTCA TTTTCTCTCC 
51 TTCTTTAGTC AGGGCAGAGC TACAACCAAG TGAAAACAGA AAAGGGGGGT 
101 GGCCTACACA ACTTTCCTGT GCAGAAGGTT CGCAACTCTT CTGTAAATTC 
151 GAAGCTGCCT ATAATAATGC AATTGAGGAA GGGAAACCTG GGATTTTAGT 
201 CTTTTTCTCT GAGCGACCCA CACCAGAATT TGCCGACTTA ACGAATGGTT 
251 CATTTTCTCT CTCTACGCCA ATCGCCAAGG GCTTTAATGT CGTTGTGTTA 
301 TGCCCCGGGC TTATCAGTCC CTTAGACTTT TTCCACAAAA TGGATCCTGT 
351 GATTCTCTAT ATGGGAAGTT TTCTAGAGAT GTTCCCTGAA GTGGAGGCAG 
401 TTAGTGGCCC TCGCTTATGT TATATCTTAA TAGATGAACA GGGTGGGGCT 
451 CAATGTCAGG CTGTCCTGCC TTTAGAAACA AAGAATTAG 

The PSORT algorithm predicts inner membrane (0.1235). 

The following C.pneumoniae protein (pid 4377375) was also expressed <SEQ ID 353; cp7375>: 

1 MQRIIIVGID TGVGKTIVSA ILARALNAEY WKPIQAGNLE NSDSNIVHEL 
51 SGAYCHPEAY RLHKPLSPHK AAQIDNVSIE ESHICAPKTT SNLIIETSGG 
101 FLSPCTSKRL QGDVFSSWSC SWILVSQAYL GSINHTCLTV EAMRSRNLNI 
151 LGMVVNGYPE DEEHWLTQEI KLPIIGTLAK EKEITKTIIS CYAEQWKEVW 
201 TSNHQGIQGV SGTPSLNLH* 

The cp7375 nucleotide sequence <SEQ ID 354> is: 

1 ATGCAACGTA TCATCATTGT AGGAATCGAC ACTGGCGTAG GAAAAACCAT 
51 TGTCAGTGCT ATCCTTGCTA GAGCACTTAA CGCAGAATAC TGGAAACCTA 
101 TACAAGCAGG GAATCTAGAA AATTCAGATA GCAATATTGT TCATGAGCTA 
151 TCGGGAGCCT ACTGTCATCC CGAAGCTTAT CGATTGCATA AGCCCTTGTC 
201 TCCACACAAG GCAGCGCAAA TCGATAATGT AAGTATCGAA GAGAGTCATA 
251 TTTGTGCGCC AAAAACAACT TCGAATCTGA TTATTGAGAC TTCAGGAGGA 
301 TTTTTATCCC CCTGCACATC AAAAAGACTT CAGGGAGATG TGTTTTCTTC 
351 TTGGTCATGT TCTTGGATTT TAGTGAGCCA AGCATATCTC GGAAGTATCA 
401 ATCACACCTG TTTAACGGTA GAAGCAATGC GCTCACGAAA CCTCAATATC 
451 TTAGGTATGG TGGTAAATGG GTATCCAGAG GACGAAGAGC ACTGGCTAAC 
501 TCAAGAAATC AAGCTTCCTA TAATCGGGAC TCTTGCCAAG GAAAAAGAAA 
551 TCACAAAGAC AATCATAAGC TGTTATGCCG AACAATGGAA GGAAGTATGG 
601 ACAAGCAATC ATCAGGGAAT TCAGGGTGTA TCTGGCACCC CTTCACTCAA 
651 TCTGCATTAG 

The PSORT algorithm predicts cytoplasm (0.0049). 

The following C.pneumoniae protein (pid 4377388) was also expressed <SEQ ID 355; cp7388>: 

1 MQVLLSPQLP PPPQHSVGSI SSPSKLRVLA ITFLVFGMLL LISGALFLTL 
51 GIPGLSAAIS FGLGIGLSAL GGVLMISGLL CLLVKREIPT VRPEEIPEGV 
101 SLAPSEEPAL QAAQKTLAQL PKELDQLDTD IQEVFACLRK LKESKYESRS 
151 FLNDAKKELR VFDFVVEDTL SEIFELRQIV AQEGWDLNFL INGGRSLMMT 
201 AESESLDLFH VSKRLGYLPS GDVRGEGLKK SAKEIVARLM SLHCEIHKVA 
251 VAFDRNSYAM AEKAFAKALG ALEESVYRSL TQSYRDKFLE SERAKIPWNG 
301 HITWLRDDAK SGCAEKKLRD AEERWKKFRK AVFWVEEDGG FDINNLLGDW 
351 GTVLDPYRQE RMDEITFHEL YEKTTFLKRL HRKCALAKTT FEKKRSKKNL 
401 QAVEEANARR LKYVRDWYDQ EFQKAGERLE KLHALYPEVS VSIRENKIQE 
451 TRSNLEKAYE AIEENYRCCV REQEDYWKEE EKREAEFRER GNKILSPEEL 
501 ESSLEQFDHG LKNFSEKLME LEGHILKLQK EATAEVENKI LSDAESRLEI 
551 VFEDVKEMPC RIEEIEKTLR MAELPLLPTK KAFEKACSQY NSCAEMLEKV 
601 KPYCKESLAY VTSKERLVSL DEDLRRAYTE CQKRFQGDSG LESEVRACRE 
651 QLRERIQEFE TQGLDLVEKE LLCVSSRLRN TECDCVSGVK KEAPPGKKFY 
701 AQYYDEIYRV RVQSRWMTMS ERLREGVQAC NKMLKAGLSE EDKVLKEEEY 
751 WLYREERKNK EKRLVGTKIV ATQQRVAAFE SIEVPEIPEA PEEKPSLLDK 
801 ARSLFTREDH T 

The cp7388 nucleotide sequence <SEQ ID 356> is: 



1 ATGCAAGTAC TTCTATCTCC GCAGCTACCC CCCCCCCCCC AACACTCTGT 
51 AGGGTCGATT TCTTCTCCAT CTAAACTTCG CGTTTTAGCG ATTACTTTTT 
101 TAGTTTTTGG TATGCTCTTA CTGATTTCAG GAGCTCTCTT TCTGACGTTA 
151 GGGATTCCAG GATTGAGTGC AGCAATTTCT TTTGGATTAG GCATCGGTCT 
201 CTCCGCATTA GGAGGAGTGC TGATGATTTC GGGACTACTA TGTCTTTTAG 
251 TAAAACGAGA GATTCCGACA GTACGACCAG AAGAAATTCC TGAAGGGGTT 
301 TCGCTGGCTC CTTCTGAGGA GCCAGCTCTA CAGGCAGCTC AGAAGACTTT 
351 AGCTCAGCTG CCTAAGGAAT TGGATCAGTT AGATACAGAT ATTCAGGAAG 
401 TGTTCGCATG TTTAAGAAAG CTGAAAGATT CTAAGTATGA AAGTCGAAGT 
451 TTTTTAAACG ATGCTAAGAA GGAGCTTCGA GTTTTTGACT TTGTGGTTGA 
501 GGATACCCTC TCGGAGATTT TCGAGTTGCG GCAGATTGTG GCTCAAGAGG 
551 GATGGGATTT AAACTTTTTG ATCAATGGGG GACGAAGCCT CATGATGACT 
601 GCAGAATCTG AATCGCTTGA TTTGTTTCAT GTATCGAAGC GGCTAGGGTA 
651 TTTACCTTCT GGGGATGTTC GAGGGGAGGG GTTAAAGAAA TCTGCGAAGG 
701 AGATAGTCGC TCGTTTGATG AGCTTGCATT GCGAGATTCA CAAGGTGGCG 
751 GTAGCGTTTG ATAGGAATTC CTATGCGATG GCAGAAAAGG CGTTTGCGAA 
801 AGCGTTGGGA GCTTTAGAAG AGAGTGTGTA TCGGAGTCTG ACGCAGAGTT 
851 ATAGAGATAA ATTTTTGGAG AGCGAGAGGG CGAAGATCCC ATGGAATGGG 
901 CATATAACCT GGTTAAGAGA TGATGCGAAG AGTGGGTGTG CTGAAAAGAA 
951 GCTTCGGGAT GCCGAGGAAC GTTGGAAGAA ATTTAGGAAA GCAGTCTTTT 
1001 GGGTAGAAGA AGACGGGGGC TTTGACATCA ATAATCTCCT TGGAGACTGG 
1051 GGGACAGTGC TTGATCCTTA TAGACAAGAG AGAATGGACG AGATAACGTT 
1101 CCATGAGTTG TATGAAAAAA CTACGTTTTT GAAAAGACTG CACAGAAAGT 
1151 GTGCGTTAGC GAAAACAACC TTTGAAAAGA AGAGATCTAA AAAGAATTTG 
1201 CAGGCAGTCG AGGAGGCGAA TGCACGTAGG TTGAAATATG TAAGGGATTG 
1251 GTATGATCAG GAGTTTCAGA AAGCAGGGGA GAGATTAGAG AAACTGCATG 
1301 CTTTGTATCC TGAGGTTTCA GTCTCTATAA GAGAGAACAA AATACAAGAG 
1351 ACGCGCTCTA ATTTAGAGAA AGCCTATGAG GCTATCGAAG AGAACTATCG 
1401 TTGCTGTGTC CGAGAGCAAG AGGACTACTG GAAAGAAGAA GAGAAAAGGG 
1451 AAGCGGAGTT TAGGGAGAGG GGAAACAAGA TTCTTTCTCC TGAGGAGCTG 
1501 GAAAGTTCTT TGGAGCAATT CGACCATGGT TTGAAAAATT TTTCTGAGAA 
1551 ATTAATGGAA TTGGAAGGGC ATATCTTAAA ACTTCAGAAA GAAGCCACAG 
1601 CAGAGGTGGA GAATAAAATA CTTTCAGATG CAGAGAGCCG CCTTGAGATT 
1651 GTATTTGAAG ATGTCAAGGA GATGCCCTGT CGAATTGAGG AGATAGAGAA 
1701 GACGCTGCGT ATGGCGGAGC TGCCCCTACT TCCTACGAAG AAGGCGTTTG 
1751 AGAAGGCCTG CTCACAATAT AATAGCTGCG CAGAGATGTT GGAGAAGGTG 
1801 AAGCCTTACT GCAAGGAGAG CCTCGCCTAT GTGACTAGCA AAGAGCGTTT 
1851 AGTGAGCTTG GATGAAGATT TACGACGAGC CTACACAGAG TGTCAGAAGA 
1901 GATTCCAGGG GGATTCGGGT TTGGAGTCGG AAGTAAGAGC CTGTCGAGAG 
1951 CAACTGCGAG AGCGGATCCA AGAGTTTGAA ACTCAAGGGC TGGACTTGGT 
2001 GGAAAAAGAG TTGCTTTGTG TGAGTAGTAG ATTAAGAAAT ACAGAGTGCG 
2051 ATTGTGTATC TGGTGTTAAG AAAGAAGCAC CTCCTGGTAA GAAGTTTTAT 
2101 GCCCAGTATT ATGATGAGAT TTATCGAGTT AGAGTTCAAT CCCGATGGAT 
2151 GACGATGTCT GAGAGATTGA GAGAGGGAGT TCAAGCATGC AACAAGATGT 
2201 TGAAGGCAGG CCTAAGCGAA GAAGATAAGG TTCTTAAAGA AGAAGAGTAT 
2251 TGGTTGTATC GAGAGGAGAG AAAGAATAAA GAGAAACGTT TGGTTGGTAC 
2301 TAAGATAGTA GCAACGCAGC AGCGAGTTGC AGCATTTGAA TCCATAGAAG 
2351 TTCCTGAGAT TCCTGAGGCC CCAGAGGAGA AACCGAGTTT GCTGGATAAA 
2401 GCGCGTTCTT TATTTACTCG CGAGGACCAT ACCTAG 



The PSORT algorithm predicts inner membrane (0.461). 

The proteins were expressed in E.coli and purified as his-tag products (Figure 174: 7200=lanes 2-3; 
7235=lanes 4-5; 7268=lanes 6-8; 7375=lanes 9-10; 7388=lanes 11-12). The recombinant proteins 
were used to immunise mice, whose sera were used in Western blots (Figures 174, 175, 176, 177 & 
178) and for FACS analysis. 

These experiments show that cp7200, cp7235, cp7268, cp7375 & cp7388 are surface-exposed and 
immunoaccessible proteins and that they are useful immunogens. These properties are not evident 
from the sequence alone. 



Example 179 



The following C.pneumoniae protein (pid 4376723) was expressed <SEQ ID 357; cp6723>: 






1 MATSVAPSPV PESSPLSHAT EVLNLPNAYI TQPHPIPAAP WETFRSKLST 

51 KHTLCFALTL LLTLGGTISA GYAGYTGNWI ICGIGLGIIV LTLILALLLA 
101 IPLKNKQTGT KLIDEISQDI SSIGSGFVQR YGLMFSTIKS VHLPELTTQN 

151 QEKTRILNEI EAKKESIQNL ELKITECQNK LAQKQPKRKS SQKSFMRSIK 

201 HLSKNPVILF DC* 

The cp6723 nucleotide sequence <SEQ ID 358> is: 

1 ATGGCAACTT CCGTAGCCCC ATCACCAGTC CCCGAGAGCA GCCCTCTCTC 

51 TCATGCTACA GAAGTTCTCA ATCTTCCTAA TGCTTATATT ACGCAGCCTC 

101 ATCCGATTCC AGCGGCTCCT TGGGAGACCT TTCGCTCCAA ACTTTCCACA 

151 AAGCATACGC TCTGTTTTGC CTTAACACTA CTGTTAACCT TAGGGGGAAC 

201 GATCTCAGCA GGTTACGCAG GATATACTGG AAACTGGATC ATCTGTGGCA 

251 TCGGCTTGGG AATTATCGTA CTCACACTGA TTCTTGCTCT TCTTCTAGCA 

301 ATCCCTCTTA AAAATAAGCA GACAGGAACA AAACTGATTG ATGAGATATC 

351 TCAAGACATT TCCTCTATAG GATCAGGATT TGTTCAGAGA TACGGGTTGA 

401 TGTTCTCTAC AATTAAAAGC GTGCATCTTC CAGAGCTGAC AACACAAAAT 

451 CAAGAAAAAA CAAGAATTTT AAATGAAATT GAAGCGAAAA AGGAATCGAT 

501 CCAAAATCTT GAGCTTAAAA TTACTGAGTG CCAAAACAAG TTAGCACAGA 

551 AACAGCCGAA ACGGAAATCA TCTCAGAAAT CATTTATGCG TAGTATTAAG 

601 CACCTCTCCA AGAACCCTGT AATTTTGTTC GATTGCTGA 

The PSORT algorithm predicts inner membrane (0.6095). 
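
PSORT's own scoring is not reproduced here. As a rough illustration of the kind of signal that can underlie an inner-membrane prediction, the Python sketch below computes a Kyte-Doolittle hydropathy average over a sliding window of 19 residues; a window average above about 1.6 is the commonly quoted indication of a possible membrane-spanning segment. The function name, the threshold, and the choice of test segments (residues 51-100 of cp6723 as read from its nucleotide listing, and residues 1-50 of cp6540) are assumptions made for illustration, and this is not the PSORT method.

```python
# Kyte-Doolittle hydropathy index for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def max_window_hydropathy(protein: str, window: int = 19) -> float:
    """Return the highest mean hydropathy over any window of the given length."""
    seq = "".join(protein.split()).rstrip("*")
    scores = [KYTE_DOOLITTLE[residue] for residue in seq]
    return max(sum(scores[i:i + window]) / window
               for i in range(len(scores) - window + 1))


# Residues 51-100 of cp6723 (predicted inner membrane).
CP6723_51_100 = "KHTLCFALTL LLTLGGTISA GYAGYTGNWI ICGIGLGIIV LTLILALLLA"
# Residues 1-50 of cp6540 (predicted cytoplasm).
CP6540_1_50 = "MSQCQSSSTS TWEWMKSFVP NWKNPTPPLS PIPSEDEFIL AYEPFVLPKT"

if __name__ == "__main__":
    for name, segment in (("cp6723", CP6723_51_100), ("cp6540", CP6540_1_50)):
        print(name, round(max_window_hydropathy(segment), 2))
```

A hydrophobic stretch of this kind is consistent with, but does not by itself establish, membrane localisation; as the Examples note, surface exposure was determined experimentally.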

The protein was expressed in E.coli and purified as a his-tag product (Figure 179A). The 
recombinant protein was used to immunise mice, whose sera were used in a Western blot (Figure 
179B) and for FACS analysis. 

These experiments show that cp6723 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 



Example 180 

The following C.pneumoniae protein (pid 4376749) was expressed <SEQ ID 359; cp6749>: 

1 MSYYFSLWYL KVQQHFQAAF DFTRSLCSRI SNFALGVIAL LPIIGQLYVG 
51 LDWLLSRIKK PEFPSDVDQI VRVEHVVGHD HRSRVEDILK RQRLSLEPRD 
101 EGKVHGDLPS APFF* 

The cp6749 nucleotide sequence <SEQ ID 360> is: 

1 ATGAGTTATT ACTTTTCTCT TTGGTATCTG AAGGTGCAAC AGCACTTTCA 

51 AGCAGCATTT GATTTTACTC GCTCCCTGTG TTCACGAATT TCTAATTTTG 

101 CTTTGGGAGT GATTGCATTG CTTCCTATTA TTGGGCAGTT GTATGTAGGG 

151 CTGGACTGGC TCCTCTCTAG GATAAAAAAG CCAGAATTTC CTTCCGATGT 

201 GGATCAGATC GTGCGAGTAG AACACGTCGT GGGTCACGAC CATAGAAGTC 

251 GAGTTGAAGA TATTCTAAAG AGACAAAGGC TCTCATTAGA GCCTAGAGAC 

301 GAGGGGAAGG TTCACGGAGA TCTGCCTTCA GCTCCTTTTT TTTGA 

The PSORT algorithm predicts inner membrane (0.2996). 
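
The sequence listings in these Examples use a fixed layout: each line starts with the position of its first residue, followed by groups of ten residues, fifty per line. The Python sketch below, which is illustrative only, reads such a listing back into a single string and checks each position label against the running length, which is a simple way to catch dropped or duplicated characters. `parse_listing` and `CP6749_NT` are hypothetical names; the data are the cp6749 nucleotide listing above.

```python
def parse_listing(text: str) -> str:
    """Join a numbered listing into one sequence, validating position labels."""
    sequence = ""
    for line in text.strip().splitlines():
        fields = line.split()
        if not fields:
            continue  # tolerate blank lines between rows
        label, groups = int(fields[0]), fields[1:]
        expected = len(sequence) + 1
        if label != expected:
            raise ValueError(f"line labelled {label}, but next position is {expected}")
        sequence += "".join(groups)
    return sequence


# cp6749 nucleotide listing <SEQ ID 360>.
CP6749_NT = """
1 ATGAGTTATT ACTTTTCTCT TTGGTATCTG AAGGTGCAAC AGCACTTTCA
51 AGCAGCATTT GATTTTACTC GCTCCCTGTG TTCACGAATT TCTAATTTTG
101 CTTTGGGAGT GATTGCATTG CTTCCTATTA TTGGGCAGTT GTATGTAGGG
151 CTGGACTGGC TCCTCTCTAG GATAAAAAAG CCAGAATTTC CTTCCGATGT
201 GGATCAGATC GTGCGAGTAG AACACGTCGT GGGTCACGAC CATAGAAGTC
251 GAGTTGAAGA TATTCTAAAG AGACAAAGGC TCTCATTAGA GCCTAGAGAC
301 GAGGGGAAGG TTCACGGAGA TCTGCCTTCA GCTCCTTTTT TTTGA
"""

if __name__ == "__main__":
    sequence = parse_listing(CP6749_NT)
    print(len(sequence), "nucleotides,", len(sequence) // 3 - 1, "codons before the stop")
```

A listing in which a group has lost or gained a character will fail the label check on the following line, which points to the row needing attention.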

The protein was expressed in E.coli and purified as a his-tag product (Figure 180A). The 
recombinant protein was used to immunise mice, whose sera were used in a Western blot (Figure 
180B) and for FACS analysis. 

These experiments show that cp6749 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 

Example 181, Example 182, Example 183, Example 184 and Example 185 

The following C.pneumoniae protein (pid 4376301) was expressed <SEQ ID 361; cp6301>: 

1 LNQDLQNVYQ ECQKATGLES EVSAYRDHLR EQITEFETQG LDVIKEELLF 

51 VSSTLKSKLS YDPLIADIPC MKFYEEYYDG IDKARVQSRW LEKSERYRKA 

101 KKGFQEMLKE GLFKEDQALK KAEYRLLREK RMNKEKLLIC NKIEAAQQRV 

151 QEFGPSDS* 

The cp6301 nucleotide sequence <SEQ ID 362> is: 

1 TTGAATCAGG ATTTACAAAA TGTATACCAA GAGTGCCAGA AGGCTACAGG 

51 TTTAGAATCG GAAGTGAGTG CATATAGAGA TCATCTTAGA GAGCAGATCA 

101 CAGAGTTTGA AACTCAAGGG CTGGACGTGA TAAAAGAAGA ACTTCTTTTT 

151 GTGAGTAGTA CTCTCAAAAG TAAATTGAGC TATGATCCAT TAATAGCAGA 

201 CATTCCCTGT ATGAAGTTTT ATGAGGAGTA TTATGATGGC ATTGATAAAG 

251 CGAGAGTTCA ATCCCGATGG CTGGAGAAGT CTGAGAGGTA TAGAAAGGCG 

301 AAGAAGGGAT TCCAAGAGAT GCTGAAGGAA GGCCTATTCA AAGAAGATCA 

351 GGCTTTGAAA AAAGCAGAGT ATAGATTACT TCGAGAGAAG AGAATGAATA 

401 AGGAGAAGCT TTTGATTTGC AATAAGATAG AAGCAGCTCA GCAGCGAGTC 

451 CAAGAATTTG GACCCTCGGA TTCATAA 

The PSORT algorithm predicts cytoplasm (0.4621). 

The following C.pneumoniae protein (pid 4376558) was also expressed <SEQ ID 363; cp6558>: 

1 MNIPAPQVPV IDEPVVNNTS SYGLSLKSSL RPITYLILAI LAIATLMSVL 
51 YFCGIISVGT FVLGMLIPLS VCSVLCVAYL FYQQSSIEKT KVFSITSPSV 
101 FFSDEDLNLL LGREEDSVSA IDELLKNFPA DDFRRPKMLP YSNFLDEQGR 
151 PNESREEDSH TSKIL* 

The cp6558 nucleotide sequence <SEQ ID 364> is: 

1 ATGAACATAC CCGCTCCCCA AGTACCAGTC ATAGATGAGC CTGTAGTGAA 
51 CAACACAAGT AGCTATGGTC TTTCATTGAA AAGTAGTTTA AGACCGATTA 
101 CTTATTTGAT TTTAGCTATC TTAGCTATAG CCACACTGAT GTCTGTTCTC 
151 TACTTTTGTG GCATCATTAG TGTTGGGACG TTTGTTTTGG GCATGCTGAT 
201 CCCTCTATCG GTCTGCTCTG TTCTTTGCGT TGCCTATTTA TTCTATCAGC 
251 AATCTTCTAT AGAAAAGACT AAGGTCTTTT CTATAACCAG TCCTTCAGTA 
301 TTTTTCTCTG ATGAGGATCT TAATTTACTC TTAGGTCGAG AAGAAGATTC 
351 AGTGTCTGCA ATTGATGAAC TTCTTAAGAA CTTTCCAGCT GATGATTTCC 
401 GTAGGCCGAA GATGCTTCCT TATTCAAATT TTCTAGATGA GCAGGGAAGG 
451 CCTAATGAGA GTAGGGAAGA AGACTCTCAT ACTTCCAAGA TCTTATAA 

The PSORT algorithm predicts inner membrane (0.4630). 

The following C.pneumoniae protein (pid 4376630) was also expressed <SEQ ID 365; cp6630>: 

1 MSMTIVPHAL FKNHCECHST FPLSSRTIVR IAIASLFCIG AIAALGCLAP 
51 PVSYIVGSVL AFIAFVILSL VILALIFGEK KLPPTPRIIP DRFTHVIDEA 
101 YGLSISAFVR EQQVTLAEFR QFSTALLCNI SPEEKIKQLP SELRSKVESF 
151 GISRLAGDLE KNNWPIFEDL LSQTCPLYWL QKFISAGDPQ VCRDLGVPRE 
201 CYGYYWLGPL GYSTAKATIF CKETHHILQQ LTKEDVLLLK NKALQEKWDT 
251 DEVKAIVERI YTTYTARGTL KTEAGGLTKE TISKELLLLS LHGYSFDQLQ 
301 LITQLPRDAW DWLCFVDNST AYNLQLCALV GALSSQNLLD ESSIDFDVNL 
351 GLYVIQDLKE AVQAFSASDE PKKELGKFLL RHLSSVSKRL ESVLRQGLHR 
401 IALEHGNARA RVYDVNFVTG ARIHRKTSIF FKD* 

The cp6630 nucleotide sequence <SEQ ID 366> is: 

1 ATGAGCATGA CGATCGTTCC ACATGCTTTA TTTAAAAATC ATTGCGAGTG 
51 TCATTCTACC TTTCCTTTGA GTTCAAGGAC TATTGTAAGA ATAGCCATTG 
101 CCAGCCTCTT TTGTATAGGT GCATTAGCAG CTTTAGGCTG TTTGGCTCCT 
151 CCCGTTTCTT ATATTGTTGG GAGTGTTTTA GCTTTTATTG CCTTTGTCAT 
201 TCTTTCTTTA GTAATTTTAG CTTTGATTTT TGGAGAGAAG AAGCTTCCAC 
251 CAACACCAAG AATCATTCCT GATAGATTTA CTCACGTGAT AGATGAAGCT 
301 TATGGCCTTT CAATCTCTGC ATTTGTAAGA GAACAGCAGG TAACATTAGC 
351 CGAGTTTAGA CAATTTTCTA CTGCCCTGTT GTGTAACATA TCTCCTGAAG 
401 AGAAAATCAA ACAATTGCCT TCTGAATTGC GAAGTAAAGT AGAGAGTTTT 
451 GGTATTAGCA GGCTCGCAGG TGATTTAGAA AAGAATAATT GGCCAATATT 
501 TGAAGATCTT TTAAGCCAAA CCTGCCCGTT ATATTGGCTT CAGAAATTTA 
551 TATCAGCAGG AGATCCACAA GTTTGTAGAG ACCTAGGTGT CCCTAGAGAA 
601 TGTTATGGGT ACTATTGGCT AGGGCCTTTG GGATACAGTA CAGCTAAGGC 
651 TACAATTTTT TGTAAAGAGA CGCATCATAT TCTTCAACAA TTAACGAAAG 
701 AGGACGTTCT TTTATTAAAA AACAAGGCTC TTCAAGAGAA ATGGGATACT 
751 GATGAAGTCA AAGCAATTGT AGAGCGTATC TACACTACCT ATACGGCACG 
801 AGGAACTCTA AAGACCGAAG CAGGGGGACT TACAAAAGAG ACAATCAGTA 
851 AGGAATTGCT ATTGTTGAGC TTGCATGGCT ATTCTTTTGA TCAGCTACAG 
901 CTGATCACTC AACTTCCTAG AGATGCTTGG GATTGGCTGT GTTTTGTAGA 
951 TAACAGTACC GCATACAACC TTCAGCTTTG TGCTCTTGTA GGAGCTTTGT 
1001 CATCCCAAAA TCTTCTTGAC GAATCTTCTA TCGATTTTGA TGTAAACCTA 
1051 GGCCTGTATG TGATTCAGGA TCTAAAAGAA GCTGTTCAAG CATTTTCTGC 
1101 TTCTGATGAG CCAAAGAAAG AACTAGGTAA ATTCTTGTTA AGGCATTTGA 
1151 GTTCAGTTTC TAAGCGATTA GAGAGTGTAT TAAGACAGGG TCTTCACAGA 
1201 ATAGCTCTAG AGCATGGAAA TGCCAGAGCT AGGGTTTATG ACGTCAATTT 
1251 TGTAACAGGA GCTAGAATTC ATAGGAAGAC GAGTATCTTC TTTAAAGACT 
1301 AA 

The PSORT algorithm predicts inner membrane (0.7092). 

The following C.pneumoniae protein (pid 4376633) was also expressed <SEQ ID 367; cp6633>: 

1 MVNIQPVYRN TQVNYSQATQ FSVCQPALSL IIVSVVAAVL AIVALVCSQS 
51 LLSIELGTAL VLVSLILFAS AMFMIYKMRQ EPKELLIPKK IMELIQEHYP 
101 SIVVDFIRDQ EVSIYEIHHL ISILNKTNVF DKAPVYLQEK LLQFGIEKFK 
151 DVHPSKLPNF EEILLQHCPL HWLGRLVYPM VSDVTPGTYG YYWCGPLGLY 
201 ENAPSLFERR SLLLLKKISF GEFALLEDGL KKNTWSSSEL VQIRQNLFTR 
251 YYADKEEVDE AELNADYEQF DSLLHLIFSH KLS* 

The cp6633 nucleotide sequence <SEQ ID 368> is: 

1 ATGGTTAATA TACAGCCTGT GTATAGGAAT ACCCAAGTCA ACTATAGTCA 

51 GGCTACCCAA TTTTCGGTGT GCCAGCCAGC GCTTAGCCTG ATTATCGTTT 

101 CTGTTGTTGC TGCTGTACTC GCTATTGTAG CTTTGGTATG CAGTCAATCT 

151 CTTTTATCCA TAGAGTTAGG AACTGCTCTT GTTCTAGTTT CTCTTATTCT 

201 TTTTGCTTCT GCTATGTTTA TGATTTATAA GATGAGACAA GAACCTAAGG 

251 AGTTGCTGAT CCCTAAGAAA ATCATGGAAC TCATCCAAGA ACATTATCCA 

301 AGTATTGTTG TTGATTTTAT TAGAGATCAG GAGGTTTCCA TTTATGAGAT 

351 ACATCACTTG ATCTCTATTC TTAATAAGAC GAATGTTTTC GACAAAGCAC 

401 CAGTATATTT ACAAGAAAAA CTCTTACAGT TTGGCATTGA GAAGTTCAAA 

451 GATGTACATC CAAGTAAGCT CCCTAATTTT GAAGAAATTC TTCTACAGCA 

501 TTGCCCATTG CATTGGTTGG GACGTCTGGT ATATCCCATG GTATCGGATG 

551 TCACTCCAGG AACCTATGGA TACTATTGGT GTGGTCCTTT AGGACTGTAC 

601 GAGAACGCTC CCTCTCTTTT TGAACGTCGA TCTCTTCTAT TGTTAAAGAA 

651 AATTAGCTTT GGAGAGTTTG CTCTTTTAGA AGATGGTCTC AAGAAAAACA 

701 CGTGGAGTTC TTCGGAACTC GTTCAAATCA GACAAAACCT TTTTACAAGA 

751 TATTATGCTG ATAAAGAAGA GGTAGATGAA GCAGAGTTAA ACGCTGATTA 

801 CGAACAGTTT GATTCCCTCC TTCACCTTAT TTTTTCTCAC AAGCTCTCTT 

851 GA 

The PSORT algorithm predicts inner membrane (0.7283). 

The following C.pneumoniae protein (pid 4376642) was also expressed <SEQ ID 369; cp6642>: 

1 MATISPISLT VDHPLVDTKK KSCSNFDKIQ SRILLITAIF AVLVTIGTLL 
51 IGLLLNIPVI YFLTGISFIA VVLSNFILYK RATTLLKPRA CGKHKEIKPK 
101 RVSTNLQYSS ISIAINRSKE NWEHQPKDLQ NLPAPSALLT DNPYEIWKAK 
151 HSLFSLVSLL PGGNPEHLLI SASENLGKTL LIEETSQNAP ISSYVDTTPS 
201 PKSLLNEAIQ ETRVEINTEL PAGDSGERLY WQPDFRGRVF LPQIPTTPEA 
251 IYQYYYALYV TYIQTAINTN TQIIQIPLYS LREHLYSREL PPQSRMQQSL 
301 AMITAVKYMA ELHPEYPLTI ACVERSLAQL PQESIEDLS* 

The cp6642 nucleotide sequence <SEQ ID 370> is: 



1 ATGGCTACAA TCTCACCCAT ATCTTTAACT GTAGATCATC CCCTAGTAGA 
51 CACTAAAAAA AAATCCTGCA GCAACTTTGA TAAGATTCAG TCTCGAATTC 

101 TATTGATTAC TGCAATCTTT GCTGTCTTAG TTACTATAGG GACCCTACTT 

151 ATTGGTTTGC TTTTAAATAT TCCTGTTATC TATTTCCTCA CAGGAATTTC 

201 ATTTATTGCT GTTGTTCTTA GCAACTTTAT CCTTTATAAA CGAGCAACCA 

251 CCCTCTTAAA ACCGCGTGCT TGTGGCAAAC ACAAAGAAAT AAAACCAAAA 

301 AGGGTCTCCA CCAACCTACA GTATTCTTCT ATCTCTATCG CAATCAATCG 

351 TTCTAAAGAA AACTGGGAAC ACCAACCCAA GGACCTACAG AATCTCCCCG 

401 CACCCTCTGC ATTACTCACA GATAACCCTT ACGAGATATG GAAAGCTAAA 

451 CATTCACTGT TTTCCCTAGT ATCCCTCCTA CCGGGAGGCA ATCCAGAACA 

501 TCTCTTAATT TCAGCTTCCG AAAATTTAGG AAAGACTCTG TTAATTGAAG 

551 AAACCTCGCA AAATGCGCCT ATATCCTCCT ACGTAGATAC CACTCCCTCC 

601 CCAAAATCCT TGCTCAATGA GGCAATTCAG GAAACCAGGG TAGAAATAAA 

651 TACAGAACTC CCTGCGGGAG ATTCAGGAGA ACGTTTATAC TGGCAACCCG 

701 ATTTCCGAGG CCGCGTCTTC CTCCCACAAA TACCAACAAC TCCTGAAGCC 

751 ATCTACCAAT ACTACTATGC ACTCTATGTC ACTTATATCC AGACTGCGAT 

801 CAATACGAAC ACCCAAATTA TCCAAATCCC TTTATACAGC TTGAGGGAGC 

851 ATCTCTATTC TAGAGAATTG CCCCCGCAAT CAAGAATGCA ACAATCTTTG 

901 GCTATGATTA CAGCAGTAAA ATACATGGCC GAGCTGCACC CAGAATATCC 

951 GCTAACTATT GCTTGTGTTG AAAGATCCTT AGCCCAACTA CCTCAAGAAA 

1001 GTATTGAGGA TCTCTCTTAG 

The PSORT algorithm predicts inner membrane (0.5288). 

The proteins were expressed in E.coli and purified as GST-fusion products. The recombinant 
proteins were used to immunise mice, whose sera were used in Western blots (Figures 181-185) and 
for FACS analysis. 

These experiments show that cp6301, cp6558, cp6630, cp6633 and cp6642 are surface-exposed and 
immunoaccessible proteins, and that they are useful immunogens. These properties are not evident 
from their sequences alone. 

Example 186 

The following C.pneumoniae protein (PID 4376389) was expressed <SEQ ID 371; cp6389>: 

1 MSEVKPLFLK NDSFDLATQR FQNLINMLQE QAEIYNEYEE KNARVQNEIK 
51 EQKDFVKRCI EDFEARGLGV LKEELASLTR DFHDKAKAET SMLIECPCIG 
101 FYYSIHQEEQ RQRQERLQKM AERYRDCKQV LEAVQVEQKD MISSRVVVDD 
151 SYFEEEKEEQ KVDNRKKEQD* 

The cp6389 nucleotide sequence <SEQ ID 372> is: 

1 ATGTCAGAAG TGAAGCCTTT GTTTTTAAAG AATGACTCTT TTGATTTGGC 

51 AACTCAGAGA TTCCAGAATC TAATTAACAT GCTACAAGAG CAAGCCGAGA 

101 TATATAACGA GTATGAAGAA AAGAATGCTA GGGTTCAGAA TGAGATTAAG 

151 GAGCAAAAGG ACTTTGTGAA AAGATGCATA GAGGACTTTG AAGCCAGAGG 

201 ACTGGGGGTG CTAAAAGAAG AGCTTGCATC TTTGACGCGT GATTTCCATG 

251 ATAAAGCAAA AGCAGAGACT TCTATGCTCA TTGAATGTCC TTGTATTGGT 

301 TTTTATTATA GTATTCATCA GGAGGAACAA AGGCAAAGGC AAGAAAGGCT 

351 TCAAAAGATG GCTGAGCGCT ATAGGGACTG TAAACAAGTC TTGGAGGCTG 

401 TCCAGGTGGA GCAAAAAGAT ATGATATCTT CTAGAGTCGT TGTCGATGAC 

451 AGCTACTTTG AAGAAGAAAA AGAAGAACAA AAGGTGGATA ACAGAAAGAA 

501 AGAACAGGAC TAG 

The PSORT algorithm predicts cytoplasm (0.3193). 

The protein was expressed in E.coli and purified as a GST-fusion product (Figure 186A) and also in 
his-tagged form. The recombinant proteins were used to immunise mice, whose sera were used in a 
Western blot (Figure 186B) and for FACS analysis. 

These experiments show that cp6389 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 

Example 187 

The following C.pneumoniae protein (PID 4376792) was expressed <SEQ ID 373; cp6792>: 

1 VLQEHFFLSE DVITLAQQLL GHKLITTHEG LITSGYIVET EAYRGPDDKA 
51 CHAYNYRKTQ RNRAMYLKGG SAYLYRCYGM HHLLNVVTGP EDIPHAVLIR 
101 AILPDQGKEL MIQRRQWRDK PPHLLTNGPG KVCQALGISL ENNRQRLNTP 
151 ALYISKEKIS GTLTATARIG IDYAQEYRDV PWRFLLSPED SGKVLS* 

The cp6792 nucleotide sequence <SEQ ID 374> is: 

1 GTGCTACAAG AACATTTTTT TCTATCGGAA GATGTAATTA CACTAGCGCA 

51 ACAGCTTTTA GGACATAAAC TCATCACAAC ACATGAGGGT CTGATAACTT 

101 CAGGTTACAT TGTAGAAACC GAAGCGTATC GTGGCCCTGA TGACAAAGCA 

151 TGCCACGCCT ACAACTACAG AAAAACTCAG AGGAACAGAG CGATGTACCT 

201 GAAAGGAGGC TCTGCTTACC TCTACCGTTG CTATGGCATG CATCACCTAT 

251 TGAATGTTGT CACTGGACCT GAGGACATTC CCCATGCCGT CCTGATCCGG 

301 GCCATCCTTC CTGATCAAGG CAAAGAACTT ATGATCCAAC GCCGCCAATG 

351 GAGAGATAAA CCCCCACACC TTCTCACCAA TGGACCCGGA AAAGTGTGCC 

401 AAGCTCTAGG AATCTCTTTG GAAAACAATA GGCAACGCCT AAATACCCCA 

451 GCTCTCTATA TCAGCAAAGA AAAAATCTCT GGGACTCTAA CAGCAACTGC 

501 CCGGATCGGC ATCGATTATG CTCAAGAGTA TCGTGATGTC CCATGGAGAT 

551 TTCTCCTATC CCCAGAAGAT TCGGGAAAAG TTTTATCTTA A 

The PSORT algorithm predicts cytoplasm (0.180). 

The protein was expressed in E.coli and purified as a his-tagged product (Figure 187A; lanes 2-4). 
The recombinant protein was used to immunise mice, whose sera were used in a Western blot 
(Figure 187B) and for FACS analysis. 

These experiments show that cp6792 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 

Example 188 

The following C.pneumoniae protein (PID 4376868) was expressed <SEQ ID 375; cp6868>: 

1 MVETVLHNFQ RYLSKYLYRV FRFPCRKKTF LSSHRVLARP SFPVDYCPGK 

51 IYDLQEIYEE LNAQLFQGAL RLQIGWFGRK ATRKGKSVVL GLFHENEQLI 
101 RIHRSLDRQE IPRFFMEYLV YHEMVHSVVP REYSLSGRSI FHGKKFKEYE 

151 QRFPLYDRAV AWEKANAYLL RGYKKRVGGG YGRA* 

The cp6868 nucleotide sequence <SEQ ID 376> is: 

1 ATGGTTGAAA CAGTACTTCA TAATTTCCAA CGTTATCTGA GCAAGTATCT 

51 CTATAGGGTA TTTCGCTTCC CATGTCGTAA AAAGACGTTC CTATCTTCGC 

101 ACAGGGTTCT TGCTCGTCCT TCATTCCCAG TAGACTACTG TCCGGGAAAG 

151 ATCTATGATT TGCAGGAGAT CTATGAGGAA TTGAATGCGC AGTTATTTCA 

201 AGGTGCACTG CGTTTACAGA TTGGTTGGTT CGGAAGGAAA GCTACCAGAA 

251 AAGGCAAGAG TGTTGTCTTG GGATTGTTTC ATGAAAATGA ACAGTTAATT 

301 CGAATTCATC GTTCTTTAGA TCGGCAGGAA ATCCCAAGAT TTTTTATGGA 

351 ATATCTTGTG TATCATGAAA TGGTTCATAG TGTAGTCCCT AGAGAGTATT 

401 CTCTATCGGG GCGTTCGATT TTTCATGGTA AAAAGTTTAA AGAATACGAA 

451 CAACGTTTCC CCTTGTATGA TCGTGCTGTT GCTTGGGAAA AGGCAAACGC 

501 TTATTTATTG CGAGGGTATA AAAAAAGAGT AGGTGGAGGA TATGGCAGGG 

551 CATAG 

The PSORT algorithm predicts bacterial cytoplasm (0.325). 

The protein was expressed in E.coli and purified as a his-tag product (Figure 188A; lanes 2-3). The 
recombinant protein was used to immunise mice, whose sera were used in a Western blot (Figure 
188B) and for FACS analysis. 

These experiments show that cp6868 is a surface-exposed and immunoaccessible protein, and that it 
is a useful immunogen. These properties are not evident from the sequence alone. 



Example 189 



The following C.pneumoniae protein (PID 4376894) was expressed <SEQ ID 377; cp6894>: 



1 MYKRCVLDKI LKGIVAGSLI LLYWSSDLLE RDIKSIKGNV RDIQEDIREI 
51 SRVVKQQQTS QAIPAAPGVM LAPKLVRDEA FALLFGDPSY PNLLSLDPYK 
101 QQTLPELLGT NFHPHGILRT AHVGKPENLS PFNGFDYVVG FYDLCIPSLA 
151 SPHVGKYEEF SPDLAVKIEE HLVEDGSGDK EFHIYLRPNV FWRPIDPKAL 
201 PKHVQLDEVF QRPHPVTAHD IKFFYDAVMN PYVATMRAVA LRSCYEDVVS 
251 VSVENDLKLV VRWKAHTVIN EEGKEERKVL YSAFSNTLSL QPLPRFVYQY 
301 FANGEKIIED ENIDTYRTNS IWAQNFTMHW ANNYIVSCGA YYFAGMDDEK 
351 IVFSRNPDFY DPLAALIDKR FVYFKESTDS LFQDFKTGKI DISYLPPNQR 
401 DNFYSFMKSS AYNKQVAKGG AVRETVSADR AYTYIGWNCF SLFFQSRQVR 
451 CAMNMAIDRE RIIEQCLDGQ GYTISGPFAS SSPSYNKQIE GWHYSPEEAA 
501 RLLEEEGWID TDGDGIREKV IDGVIVPFRF RLCYYVKSVT AHTIADYVAT 
551 ACKEIGIECS LLGLDMADLS QAFDEKNFDA LLMGWCLGIP PEDPRALWHS 
601 EGAMEKGSAN WGFHNEEAD KIIDRLSYEY DLKERNRLYH RFHEIIHEEA 
651 PYAFLFSRHC SLLYKDYVKN IFVPTHRTDL IPEAQDETVN VTMVWKEKKE 
701 DPCLSTS* 



The cp6894 nucleotide sequence <SEQ ID 378> is:

1 ATGTATAAAA GATGTGTGCT AGATAAAATT TTAAAGGGGA TTGTCGCCGG
51 TTCTTTAATT TTGTTATACT GGTCCTCAGA CCTACTTGAA AGAGACATTA
101 AGTCGATAAA AGGTAACGTA AGAGATATTC AAGAAGACAT TCGTGAAATC
151 TCACGCGTAG TGAAACAACA GCAGACATCA CAAGCTATCC CTGCGGCACC
201 TGGGGTGATG CTCGCTCCTA AGCTCGTCAG AGACGAAGCT TTTGCTCTAC
251 TCTTTGGAGA TCCTAGTTAT CCTAATTTAC TTTCCCTAGA CCCCTATAAA
301 CAGCAGACTC TTCCTGAACT TCTAGGAACA AATTTCCACC CTCATGGTAT
351 CCTACGCACT GCCCATGTCG GAAAACCCGA AAATCTGAGC CCTTTTAATG
401 GCTTTGATTA TGTCGTGGGC TTTTACGATC TCTGTATTCC TAGTTTAGCT
451 TCTCCCCACG TAGGGAAATA CGAAGAATTT TCTCCAGATC TCGCTGTGAA
501 AATAGAAGAA CATCTTGTTG AAGATGGTTC TGGGGATAAA GAGTTTCACA
551 TCTATCTGAG GCCGAATGTT TTTTGGCGTC CTATAGATCC TAAGGCCCTT
601 CCAAAACACG TTCAGTTAGA CGAAGTATTT CAACGTCCTC ATCCTGTGAC
651 AGCTCATGAT ATTAAGTTTT TCTACGACGC TGTTATGAAC CCTTATGTAG
701 CAACCATGCG AGCAGTGGCT CTGCGCTCTT GTTATGAAGA TGTGGTTTCT
751 GTCTCAGTAG AAAACGATTT AAAATTAGTA GTCAGATGGA AAGCACACAC
801 GGTAATCAAT GAAGAAGGAA AGGAAGAGCG CAAAGTGCTC TACTCTGCAT
851 TTTCTAATAC CTTAAGCTTG CAGCCCCTCC CTAGATTTGT ATATCAGTAT
901 TTTGCTAACG GGGAAAAAAT CATTGAAGAT GAGAATATCG ATACCTACCG
951 AACCAATTCC ATTTGGGCGC AAAACTTCAC TATGCATTGG GCAAACAACT
1001 ATATTGTAAG TTGTGGAGCC TACTACTTTG CAGGGATGGA TGATGAGAAA
1051 ATCGTGTTTT CTAGAAATCC TGACTTCTAT GATCCTCTTG CGGCTCTTAT
1101 TGACAAGCGT TTCGTCTATT TTAAGGAAAG CACAGACTCC CTATTCCAAG
1151 ATTTTAAGAC AGGGAAAATA GACATCTCTT ACCTTCCACC CAACCAAAGA
1201 GATAATTTCT ATAGTTTTAT GAAAAGCTCC GCTTATAACA AACAGGTAGC
1251 TAAGGGAGGA GCCGTCCGTG AAACAGTCTC AGCAGATCGA GCATATACGT
1301 ACATAGGATG GAATTGCTTT TCATTATTTT TCCAAAGCCG ACAGGTGCGC
1351 TGTGCTATGA ACATGGCAAT CGATAGAGAG AGGATTATCG AACAGTGCTT
1401 GGATGGCCAA GGCTATACGA TTAGTGGGCC TTTTGCTTCG AGTTCTCCTT
1451 CTTATAATAA ACAGATCGAA GGGTGGCATT ATTCTCCAGA AGAAGCAGCT
1501 CGTCTCCTGG AAGAAGAGGG ATGGATAGAT ACCGATGGCG ATGGAATCCG
1551 AGAAAAAGTT ATCGATGGTG TGATTGTCCC GTTCCGTTTC CGTTTATGCT
1601 ATTATGTAAA GAGTGTCACC GCTCATACCA TTGCAGATTA CGTAGCTACT
1651 GCTTGTAAGG AAATCGGAAT CGAGTGTAGC CTTCTAGGAC TAGATATGGC
1701 CGATCTTTCG CAAGCTTTTG ATGAAAAGAA TTTCGATGCT CTTTTAATGG
1751 GATGGTGTTT AGGAATTCCT CCTGAGGATC CTAGGGCTTT ATGGCATTCT
1801 GAAGGGGCTA TGGAAAAGGG TTCAGCGAAT GTTGTAGGTT TCCATAATGA
1851 AGAAGCTGAT AAAATCATAG ACAGACTCAG CTACGAATAC GATCTGAAAG
1901 AACGTAATCG CCTGTACCAC CGTTTCCATG AAATTATTCA TGAGGAAGCT
1951 CCTTATGCTT TCTTGTTCTC ACGACATTGT TCCTTACTTT ATAAGGATTA
2001 TGTAAAAAAT ATTTTCGTAC CTACACATAG AACAGATTTA ATTCCTGAAG
2051 CTCAGGATGA GACTGTCAAC GTAACTATGG TATGGCTTGA GAAGAAGGAG
2101 GATCCGTGCT TAAGTACATC CTAA

The PSORT algorithm predicts inner membrane (0.162). 

The protein was expressed in E.coli and purified as a his-tag product (Figure 189A) and also in
GST/his form. The recombinant proteins were used to immunise mice, whose sera were used in a 
Western blot (Figure 189B) and for FACS analysis. 

These experiments show that cp6894 is a surface-exposed and immunoaccessible protein, and that it
is a useful immunogen. These properties are not evident from the sequence alone.

Example 190 

The following C.pneumoniae protein (PID 4377193) was identified in the 2D-PAGE experiment
<SEQ ID 379; cp7193>:

1 MKRVIYKTIF CGLTLLTSLS SCSLDPKGYN LETKNSRDLN QESVILKENR 

51 ETPSLVKRLS RRSRRLFARR DQTQKDTLQV QANFKTYAEK ISEQDERDLS 

101 FVVSSAAEKS SISLALSQGE IKDALYRIRE VHPLALIEAL AENPALIEGM

151 KKMQGRDWIW NLFDTQLSEV FSQAWSQGVI SEEDIAAFAS TLGLDSGTVA 

201 SIVQGERWPE LVDIVIT* 

A predicted leader peptide is underlined. 

The cp7193 nucleotide sequence <SEQ ID 380> is:

1 ATGAAAAGAG TCATTTATAA AACCATATTT TGCGGGTTAA CTTTACTTAC
51 AAGTTTGAGT AGTTGTTCCC TGGATCCTAA AGGATATAAC CTAGAGACAA 
101 AAAACTCGAG GGACTTAAAT CAAGAGTCTG TTATACTGAA GGAAAACCGT 
151 GAAACACCTT CTCTTGTTAA GAGACTCTCT CGTCGTTCTC GAAGACTCTT
201 CGCTCGACGT GATCAAACTC AGAAGGATAC GCTGCAAGTG CAAGCTAACT 
251 TTAAGACCTA CGCAGAAAAG ATTTCAGAGC AGGACGAAAG AGACCTTTCT 
301 TTCGTTGTCT CGTCTGCTGC AGAAAAGTCT TCAATTTCGT TAGCTTTGTC 
351 TCAGGGTGAA ATTAAGGATG CTTTGTACCG TATCCGAGAA GTCCACCCTC 
401 TAGCTTTAAT AGAAGCTCTT GCTGAAAACC CTGCCTTGAT AGAAGGGATG 
451 AAAAAGATGC AAGGCCGTGA TTGGATTTGG AATCTTTTCT TAACACAATT 
501 AAGTGAAGTA TTTTCTCAAG CTTGGTCTCA AGGGGTTATC TCTGAAGAAG
551 ATATCGCCGC ATTTGCCTCC ACCTTAGGTT TGGACTCCGG GACCGTTGCG 
601 TCCATTGTCC AAGGGGAAAG GTGGCCCGAG CTTGTGGATA TAGTGATAAC 
651 TTAA 
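The correspondence between the nucleotide listing and the protein listing can be checked by translating the open reading frame. The following is a minimal sketch, assuming the standard genetic code; the helper name `translate` is illustrative and not part of the original disclosure. It translates the first 50 and the last 54 nucleotides of the cp7193 sequence given above.

```python
# Standard genetic code, built from the conventional TCAG codon ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate in frame 1, ignoring layout whitespace and any trailing partial codon."""
    seq = "".join(dna.split()).upper()
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


# Nucleotides 1-50 and 601-654 of the cp7193 listing (601 is in frame: 600 = 3 x 200).
first_row = "ATGAAAAGAG TCATTTATAA AACCATATTT TGCGGGTTAA CTTTACTTAC"
last_rows = "TCCATTGTCC AAGGGGAAAG GTGGCCCGAG CTTGTGGATA TAGTGATAAC TTAA"

print(translate(first_row))  # start of the protein listing
print(translate(last_rows))  # end of the protein listing, '*' marks the stop codon
```

The two translated fragments agree with the first and last rows of the cp7193 protein listing.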

The PSORT algorithm predicts periplasmic (0.925). 

This shows that cp7193 is an immunoaccessible protein in the EB and that it is a useful immunogen. 
These properties are not evident from the protein's sequence alone. 



It will be appreciated that the invention has been described by way of example only and that 
modifications may be made whilst remaining within the spirit and scope of the invention. 



TABLE II - sequences of the primers used to amplify Cpn genes.



Orf ID


N-terminus final primer


C-terminus final primer


CP0014P 


GCGTC CCG GGTCATATG AAGTCTTCTTTCCCCA 


GCGT CTC GAG ATGAAAGAGTTTTTGCG 


CP0015P 


GCGTCCCGGGTCATATG TCAGCTCTGTTTTCTGA 


GCGT CTC GAG GAATTGGTATTTTGCTC 


CP0016P 


GCGTCCCGGGTCATATG GCCGATCTCACATTAG


GCGT CTC GAG GTCCAAGTTAAGGTAGCA 


CP0017P 


GCGT CCG GGTCATATG GGTATCAAGGGAACTG 


GCGT CTC GAG AAATCCGAATCTTCC 


CP0019P


GCGTCCCGGGTCATATG CAAGACTCTCAAGACTATAG


GCGT CTC GAG AAATCGGTATTTACCC 


CP6260P 


GCGTC CCG GGT GCTAGCACTACGATTTCTTTAACCC 


GCGT CTC GAG AAAACGAAATTTGCTTC 


CP6397P 


GCGTC CCG GGT CAT ATG TTTAAACTGCTAAAAAATCTATT


GCGT CTC GAG ATGAAAGAAGAGTCCTCG 


CP6456P 


GCGTC CCG GGT CATATG TCATCTCCTGTAAATAACA 


GCGT CTC GAG CTGACCATCTCCTGTT 


CP6466P 


GCGTC CCG GGT CAT ATG TGCAAGGAGTCCAGT 


GCGT CTC GAG ATTTTCCTTAGCATAACG 


CP6467P 


GCGTC CCG GGT CAT ATG TGTTCCCCATCCCAA 


GCGT CTC GAG TAGTTTTTCTATAAAACGAAAGTCT 


CP6468P 


GCGTC CCG GGT CAT ATG TGCTCCTCCTACTCTTC 


GCGT CTC GAG GGGGAAATAGGTATATTTGA 


CP6469P 


GCGTC CCG GGT CAT ATG AGCTGCTCAAAGCAA 


GCGT CTC GAG ACTTAAGATATCGATATTTTTGA 


CP6552P 


GCGTC CCG GGT CAT ATG TGCCATAAGGAAGATG 


GCGT CTC GAG ACCATTGTCTTGAGTCAT 


CP6567P 


GCGTC CCG GGT CAT ATG ACCTCACCCATCCCC 


GCGT CTC GAG AGAACCCGGTAGACGG 



CP6576P 


GCGTC CCG GGT CAT ATG ACTGAAAAAGTTAAAGAAGG 


GCGT CTC GAG GAA CATGCCCCCTAA 


CP6727P 


GCGTC CCG GGT CATATGCTACATCCACTAATGGC 


GCGT CTC GAG GAAAGAATAACGAGTTCC 


CP6729P


GCGTC CCG GGT CAT ATGGCAGATGCTTCTTTATC 



GCGT CTC GAG GA ATTiaRT'fi'PCTTAGr'C 


CP6731P 


GCGTC CCG GGT CATATGGCTGTTGTTGAAATCAAT 


gcgt*c cat ggc ggc cgc fiaaPTnfiAnrTTirr'prr 



CP6736P


GCGTC CCG GGT GCT AGCGTAGAAGTTATCATGCCTT 


GCGTC CAT GGC GGC CGC A & ATCCPA ATTTVCFTV 


CP6737P 


GCGT CGA TCC CAT ATG GAGACTAGACTCGGAGG 


www 1 wXw uftb AAAlulviuAl 1 1 


CP6751P 


GCGTC CCG GGT GCT AGC AATGAAGGTCTCCAACT 


GCGT CTC G&G A A ATCWATTCTACPPGC 


CP6752P 


GCGTGA ATT CAT ATGTTCGGGATGACTCCT 


GCGT CTC fZUCS GAATTTTAARnTAf TTCCTCl 


CP6753P 


GCGTC CCG GGT GCT AGCACTCCCTACTCTCATAGAG 


GCGT CTC GAG AAACTTAAAGGTCGTTC 


CP6767P 


GCGTC CCG GGT CAT ATG ATAAAACAAATAGGCCGT 


GCGT CTC GAG TTVGTAAGCA ACTTCAGA 



CP6829P 


GCGTC CCG GGT CAT ATG AAGCAGATGCGTCTTT 


GCGTC CAT GGC GGC CGC GAAACTAAGGGAGAGGC 



CP6830P 


GCGTC CCG GGT CAT ATG GATCCCGCGTCTGTT 


GCGTC CAT GGC GGC CGC GAATACAAACCGCATCC 



CP6832P 


GCGTC CCG GGT CAT ATG CATAAAGTAATAGTTTTCATTT 


GCGT CTC GAG TAAACTAGAAAAAGTCGTC 


CP6848P 


GCGTC CCG GGT CAT ATG TCATCAAATCTACATCCC 


GCGT CTC GAG AACGCGAGCTATTTTAC 


CP6849P 


GCGTC CCG GGT GCT AGC AGCGGGGGTATAGAG 


GCGT CTC GAG ATACACGTGGGTATTTTC 


CP6850P 


GCGTC CCG GGT CAT ATG TGCCGCATTGTAGAT 


GCGT CTC GAG CTGTTTGCATCTGCC 


CP6854P 


GCGTC CCG GGT GCT AGC TCAATAGCTATTGCAAG 


GCGT CTC GAG TTATCGAAATGTCTTTG 


CP6879P 


GCGTC CCG GGT CAT ATG GCAACACCCGCTCAA 


GCGTC CAT GGC GGC CGC TCCTTGAAATTGCTCTTGC 


CP6894P 


GCGTC CCG GGT CAT ATG TATAAAAGATGTGTGCTAGA 


GCGT CTC GAG GGATGTACTTAAGCACG 


CP6900P 


GCGTC CCG GGT CAT ATG AAGATAAAATTTTCTTGGAAG 


GCGT AAG CTT GGGAAGACGATACCG 


CP6952P 


GCGTC CCG GGT CAT ATG CTCTCGGATCAATATATAGG 




CP7034P 


GCGTC CCG GGT CAT ATG AAAAAACAGGTATATCAATG 


GCGT AAG CTT AAACGCTGAAATTATACC 


CP7090P 


GCGTC CCG GGT CAT ATG TGTAGCCTTTCCCCT 


GCGT CTC GAG GCGTGCATGAATCTTA 


CP7091P 


GCGTC CCG GGT CAT ATG GAAGAATTAGAAGTTGTTGT 


GCGT CTC GAG TAGTGTTCTCTTTATCGGT 


CP7170P 


GCGTC CCG GGT CAT ATG CTAGGGGCTGGAAACC 


GCGT AAG CTT AAACTGCAGACCTGACG 


CP7228P 


GCGTC CCG GGT CAT ATG ACTGCTGTTCTTATTCTTACA 


GCGT CTC GAG ATCTGAAAGCGGAGG 


CP7249P 


GCGTC CCG GGT CAT ATG ATCCCATCCCCTACC 


GCGT CTC GAG ATCAGGTTGCTGAGACTT 


CP7250P 


GCGTC CCG GGT CAT ATG AATCTTTCAAACAGGTCT 


GCGT CTC GAG ATTTTTTCTAGAGAGACTCTC 


CP0018P 


GTGCGT CATATG GCAACCACTCCACTAA 


ACTCGCTA GCGGCCGC TAATGAGGTCCCCAG 


CP6270P 


GTGCGT CATATG AATTTATTAGGAGCTGCT 


ACTCGCTA GCGGCCGC AAATTTGATTTTGCTACC 


CP6735P 


GTGCGT CATATG GCAGCACAAGTTGTATAT 


ACTCGCTA GCGGCCGC TGGCGTAGAAGTGATC 


CP6998P 


GTGCGT CATATG TTGCCTGTAGGGAAC 


ACTCGCTA GCGGCCGC GAATCTGAACTGACCAGA 


CP7033P 


GTGCGT CATATG GTTAATCCTATTGGTCCA 


ACTCGCTA GCGGCCGC TTGGAGATAACCAGAATATA 


CP7287P 


GTGCGT CATATG TTACACAGCTCAGAACTAGA 


ACTCGCTA GCGGCCGC GAAAATAATACGGATACCA 


CP0010P 


GTGCGT CATATG GCAACTGCTGAAAATATA 


GCGT CTCGAG GAATTGGAACTTACCC 


CP0468P 


GTGCGT GCTAGC ATTTTTTATGACAAACTCTAT 


GCGT CTCGAG AAATGTGCAATGACTCT 


CP6272P 


GTGCGT CATATG TTGACTCATCAAGAGGCT 


GCGT CTCGAG GAAGGGAGGTTTTTTAGGT 


CP6273P 


GTGCGT CATATG ACATATCTGGAAGCTC


ACTCGCTA GCGGCCGC CTCCACAATTTTTATG 


CP6362P 


GTGCGT CATATG CCCTTTGATATTACTTATTATACA 


GCGT CTCGAG TCGTTTCCAAATCCA 


CP6372P


GTGCGT CATATG AAACAACACTATTCTCTAAATA


GCGT CTCGAG TT/TCTTGTGGTTTTTCT 


CP6390P


GTGCGT CATATG OGAGAGGTGCCTAAG


ACTCGCTA GCGGCCGC TCTCCTAGACAGCCTT 


CP6402P 


GTGCGT CATATG AATGTTGCGGATCTCCTTT 


GCGT CTCGAG GAAGK3GGTTGGCCGT 


CP6446P 


GTGCGT CATATG TGTAATCAAAAGCCCTCTT 


GCGT CTCGAG GGGCTGAGGAGGAAC 


CP6520P


GTGCGT GCTAGC AAACACTACCTATCATTTTCT 


GCGT CTCGAG CAGAAAGGCTTTTCTTT 


CP6577P


GTGCGT CATATG AATTTAGGCTATGTTAATTTA 


GCGT CTCGAG GTTTTGTTTTTTGAAAGA 


CP6602P 


GTGCGT CATATG GCAGCATCAGGAGGCA 


GCGT CTCGAG TGACCAAGGATAGGGTTTAG
CP6607P


GTGCGT CATATG CCTCGTGGTGACACTTT 


GCGT CTCGAG CGCTGCTTCTTGCTC 


CP6615P 


GTGCGT CATATG TGCTCTCAAAAAACGACAA 


GCGT CTCGAG TGAAGAGGCGCCATC 


CP6624P 


GTGCGT CATATG GATGCGAAAATGGGA 


GCGT CTCGAG TCTTTGACATTCAAGAGC 


CP6672P 


GTGCGT CATATG ATTCCTACCATGTTAATG 


GCGT CTCGAG GTCATACAATTTCCTTATATA 


CP6679P 


GTGCGT CATATG TGCACTCACTTAGGCT 


GCGT CTCGAG CGAGTAGTTAGCACAAAC 


CP6717P 


GTGCGT GCTAGC AAGACAATCGTAGCTTCA 


ACTCGCTA GCGGCCGC GGCTGGCATATAGGT 


CP6784P 


GTGCGT GCTAGC AAATCAAGATGTTCTATTGATA 


GCGT CTCGAG TCCAAAACAACCCTCT 


CP6802P 


GTGCGT CATATG TGCGTAAGTTATATTAATTCCTT 


GCGT CTCGAG CAGTCGGGCTTGTTG 


CP6847P 


GTGCGT CATATG TCGGATCTTTTACGAG 


GCGT CTCGAG TTTTCTACACTGTTGTAATAAA 


CP6884P 


GTGCGT CATATG AATCAGCTGCTTTCT 


GCGT CTCGAG AGAGAAGGTAATTGTACC 


CP6886P 


GTGCGT CATATG TGTCTACTTATTATCTATCTCTAC 


GCGT CTCGAG TTCAGAAAAATGGCT 


CP6890P 


GTGCGT CATATG TCCCCACGACGACAA 


GCGT CTCGAG TCCTGCAGCATTTAGC 


CP6960P 


GTGCGT CATATG TGTGACGTACGGTCTA 


ACTCGCTA GCGGCCGC TTCACCTTGATTTCCT 


CP6968P 


GTGCGT CATATG TGCGATGCAAAAC 


ACTCGCTA GCGGCCGC GGAAGTATGCTTAGATATT 


CP6969P 


GTGCGT CATATG TGCTGTGGTTACTCTATT 


ACTCGCTA GCGGCCGC AAAAAGGTCATAGTATACCT 


CP7005P 


GTGCGT CATATG AAAACTGTGATATTGAACA 


GCGT CTCGAG CTGAGCTTCTATTTCTATTAT 


CP7072P 


GTGCGT CATATG CCCATTTATGGGAAA 


GCGT CTCGAG GTTGAGCAAAGGTTTG 


CP7101P 


GTGCGT CATATG TATTCGTGTTACAGCAA 


GCGT CTCGAG GAAAAATTCTTTAGGGAG 


CP7102P 


GTGCGT CATATG GCCGCTAAAGCAAAT 


GCGT CTCGAG TGAAAATGAAAGGATGGT 


CP7105P


GTGCGT GCTAGC AGTCTATATCAAAAATGGTG 


GCGT CTCGAG ATCTTTCATTTGGTTATCT 


CP7106P 


GTGCGT CATATG AAAGATTTGGGGACTCT 


GCGT CTCGAG GAATCCTAAGGCATACCTA


CP7107P 


GTGCGT GCTAGC AGTATAGTCAGAAATTCTGCA 


GCGT CTCGAG GAAGCTAAGATTATAGCTACTTT 


CP7108P 


GTGCGT GCTAGC GCGGCCCTTTCCA 


ACTCGCTA GCGGCCGC TTTATGTATATGGAACAGATAGG 


CP7109P 


GTGCGT CATATG GGACATTTTATTGATATTG 


ACTCGCTA GCGGCCGC ATCATCAAGGTAGATAAAG 


CP7110P 


GTGCGT CATATG GGTTATTGCTATGTAATTACA 


GCGT CTCGAG TTCTGATTGGACTCCA 


CP7127P 


GTGCGT CATATG GTGGCTTTAACGATAGC 


ACTCGCTA GCGGCCG GCAGCCATCGTATTC 


CP7130P 


GTGCGT CATATG TTCAATATGCGAGG 


GCGT CTCGAG CTTCTTATTTGAACTTTG 


CP7140P 


GTGCGT CATATG ACAGCCGGAGCAGCT 


GCGT CTCGAG AGCACCCTCAATTTCATTG


CP7182P 


GTGCGT CATATG GGATATGTTTTCTATGTGATC 


GCGT CTCGAG GCTACTAAATCGAATCGA 


CP6262P 


GTGCGT CATATG ATCCCTGGATTAAGTTCA 


ACTCGCTA GCGGCCGC TTCACTGGGAGCTTGA 


CP6269P 


GTGCGT CATATG TACCAGGAGAATCTAAGAT 


ACTCGCTA GCGGCCGC GATTTTCTTCTTCAGCTC 


CP6296P 


GTGCGT CATATG GAGGAGGTGTCTGAGTAT 


ACTCGCTA GCGGCCGC ATGTTTCTTTTTACTCTTTCT


CP6419P 


GTGCGT CATATG GCTCCAGTCCGTGTT 


GCGT CTCGAG AAGTGTTCGTTGGAAGT 


CP6601P 


GTGCGT CATATG AATAAGCTACTCAATTTCGT 




CP6639P 


GTGCGT CATATG TTAAATTCAAGCAATTCA 


GCGT CTCGAG AGGAACTAAAACCTCATCT 


CP6664P 


GTGCGT GCTAGC GTTTTATTTCATGCTCAA


ACTCGCTA GCGGCCGC CTTAGAAAGACTATTTTCTAAGTA 


CP6696P 


GTGCGT CATATG TGCGTGATAATGGG 


GCGT CTCGAG ATTCATCTTCGTAAAGAAT 


CP6757P 


GTGCGT CATATG GCAGTTGGTGGCGT 


ACTCGCTA GCGGCCGC CTGTCCCTCTGGAGC 


CP6790P 


GTGCGT GCTAGC AGTGAACACAAAAAATCA 


ACTCGCTA GCGGCCGC CTTATCGTCGTTATCAATA 


CP6814P 


GTGCGT CATATG CATGACGCACTTCTAAG 


GCGT CTCGAG TACAGCTGCGCGA 


CP6834P 


GTGCGT CATATG GTTATGGGAACCTATATCG 


GCGT CTCGAG TACATTTGTATTGATTTCAG 


CP6878P 


GTGCGT CATATG AACGTCCCTGATTCC 


GCGT CTCGAG GCTAGCGGCTCTTTC 


CP6892P 


GTGCGT CATATG CAGAAGCATCCTTCCT 


ACTCGCTA GCGGCCGC TCCTCTTTAGGAAATGG 


CP6909P 


GTGCGT CATATG TCCTCTTTAGGAAATGG 


GCGT CTCGAG CAGTGCCAAGTAGGGA 


CP7015P 


GTGCGT CATATG GCAGTACGATTAATTGTTG 


GCGT CTCGAG TTTATTGTAGTCTATTTTATATTTC


CP7035P 


GTGCGT GCTAGC AGCAGAAAAGACAATGA 


GCGT CTCGAG ATTTTGAGTGTCTTGCA 


CP7073P 


GTGCGT CATATG ATTACCATAAATCACGTG 


GCGT CTCGAG TATCCATOGACTTATAGC 


CP7085P 


GTGCGT GCTAGC TGTATTTTCCCTTACGTA 


ACTCGCTA GCGGCCGC GGATTCTGCATACTCTG


CP7092P 


GTGCGT CATATG TCTCCTCTTCCTAAAAAA 


GCGT CTCGAG GGATTCATTACTGACCA 


CP7093P 


GTGCGT CATATG AAATACCGCTTCACG 


GCGT CTCGAG ATTCTGTAGGGCTACGT 


CP7094P 


GTGCGT CATATG GTACACTTCTCTCATAACCC 


GCGT CTCGAG TAAGTTTGTATTGCGGTAT 


CP7132P 


GTGCGT CATATG TTGTTATTAGGGACTTTAGGA 


GCGT CTCGAG TTTCCCAACCGCA 


CP7133P 


GTGCGT CATATG GCTGCGAATGCTC 


GCGT CTCGAG TAATTTAATACTCTTTGAAGG 


CP7177P


GTGCGT CATATG CCTACTCAAGTTAAAACAGA 


GCGT CTCGAG AAGTTTATATTTCAGCACTT 


CP7184P 


GTGCGT GCTAGC CATATAGGATTTTGCCA 


GCGT CTCGAG GTACTTAGCAAAGCGAT 


CP7206P 


GTGCGT GCTAGC AAGAAGCTATATCACCCTA


GCGT CTCGAG CACACCGAGGAAAC 


CP7222P 


GTGCGT CATATG GTAGTTTCAGAAGAAAAAGTC 


GCGT CTCGAG ACGTATGCGCAACTG 


CP7223P 


GTGCGT CATATG GAAGTATTAGACCGCTCT 


GCGT CTCGAG CGAGAAAAAGCTTCC 


CP7224P 


GTGCGT CATATG ATGAAGAAAATTCGAAA 


ACTCGCTA GCGGCCGC TAAGCATTCACAAATGA 


CP7225P 


GTGCGT CATATG CATATTTTGCTTGATCGT 




CP7303P 


GTGCGT CATATG CTTGTCTATTGTTTTGATCC 


GCGT CTCGAG AAAATATACGGAACTCGC


CP7304P 


GTGCGT GCTAGC GAAGTTTATAGTTTTTCCC


GCGT CTCGAG TTTTTGATTCCTTAAGAAG


CP7305P 


GTGCGT CATATG GAAGTTTATAGTTTTCACCCT 


GCGT CTCGAG ACTCCTTGAGAAGGGAA 


CP7307P 


GTGCGT CATATG CTTAATCATGCTAAAAAGC 


ACTCGCTA GCGGCCGC CTCTTTTATTTTAGGAAGCT 
CP7342P


GTGCGT CATATG AAAAAAAAATTTATTTTCTACT 


ACTCGCTA GCGGCCGC CACACTCTGTTCTTCTG 


CP7347P 


GTGCGT CATATG TTTTCTAAGGATTTGACTAA


GCGT CTCGAG CGAAGCAGAAGTCGT 


CP7353P 


GTGCGT CATATG AATATGCCTGTTCCTTCT


GCGT CTCGAG GGGGCGTAGGTTGTA 


CP7193P 


GTGCGT CATATG TGTTCCCTGGATCCT 


ACTCGCTA GCGGCCGC AGTTATCACTATATCCACAAG 


CP7248P 


GTGCGT GCTAGC CTTGAACATTCTAAACAAGAT 


GCGT CTCGAG ACGTAGTTTAAGAGCAGACT 


CP7261P 


GTGCGT CATATG TGTCTATCTGCCTACATAG 


GCGT CTCGAG TTTTGATGCTTCTTTCA


CP7280P 


GTGCGT CATATG GACCAGAAAATTGAAAA 


GCGT CTCGAG AGAGGTCTTCTGAGTGC 


CP7302P 


GTGCGT CATATG AATTTCCATTGTAGTGTAGT 


GCGT CTCGAG GAACAGTTCGATTTGTG 


CP7306P 


GTGCGT CATATG CTTCCTTTATCAGGGCA 


ACTCGCTA GCGGCCGC TTCTTCAGGTTTCAGG 


CP7367P 


GTGCGT GCTAGC CGTTATGCCGAGGTC 


GCGT CTCGAG TTCGTGCATTTGGTG 


CP7408P 


GTGCGT CATATG TTGAAAATCCAGAAAAA


GCGT CTCGAG ATTCATTTTCGGAAGAG 


CP7409P 


GTGCGT CATATG AGACGTTATCTTTTCATGGT 


GCGT CTCGAG CCCTTTGCTCTTTACATAG 


CP6733P 


GTGCGT ACTAGT TGTCACCTACAGTCACTAG 


GCGT CTCGAG GAATCGGAGTTTGGTA 


CP6728P


GTGCGT ACTAGT AAGTCCTCTGTCTCTTGG 


GCGT CTCGAG GAAACAAAACTTAGAGCCC 
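The relationship between a primer pair in Table II and its target gene can be illustrated with the cp6894 sequence of Example 189. The sketch below is illustrative only. It assumes that the last whitespace-delimited block of each primer is the gene-specific portion and that the preceding blocks form a 5' cloning tail (the tails contain CATATG and CTCGAG, which match the recognition sequences of NdeI and XhoI); the helper names are hypothetical. The N-terminus primer should match the coding strand directly, whereas the C-terminus primer should match it only after reverse complementation.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def gene_specific_part(primer: str) -> str:
    """Assumption: the last space-delimited block is the gene-specific portion."""
    return primer.split()[-1]


# 5' start and 3' end of the cp6894 coding sequence (SEQ ID 378, Example 189).
gene_start = "ATGTATAAAAGATGTGTGCTAGATAAAATT"
gene_end = "GATCCGTGCTTAAGTACATCCTAA"  # ends with the TAA stop codon

# CP6894P entries from Table II.
n_terminus_primer = "GCGTC CCG GGT CAT ATG TATAAAAGATGTGTGCTAGA"
c_terminus_primer = "GCGT CTC GAG GGATGTACTTAAGCACG"

forward_site = gene_specific_part(n_terminus_primer)
reverse_site = reverse_complement(gene_specific_part(c_terminus_primer))

# The forward site starts right after the ATG start codon (index 3).
print(gene_start.find(forward_site))
# The reverse site ends immediately before the stop codon.
print(gene_end.endswith(reverse_site + "TAA"))
```

On this reading, the C-terminus primer omits the stop codon, as would be needed for a C-terminal fusion; that interpretation is an inference from the sequences rather than a statement in the table.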



TABLE III - Proteins with best results in FACS analysis

cp number   Molecular Weight (kDa)              Fusion type
            Theoretical    Western Blot
6260        97.5           94; 70               GST
6270        87.5                                GST
6272        78.0           90                   GST
6273        58.6           74; 64; 50           GST
6296        31.1                                GST
6390        88.9           102                  GST
6456        42.5           89; 67; 45           GST
6466        57.5           59; 56               His
6467        59.0           67                   GST
6552        28.4           50; 27               GST
6576        86.0           79; 70; 62; 45       GST
6577        17.3           12                   GST
6602        43.4           53; 42; 34           GST
6664        54.5           104; 45              GST
6696        47.9           95; 53               GST
6727        130.0-142.9    123; 61; 39          His
6729        94.8           multiple bands       GST
6731        95.5           97                   GST
6733        97.1           104                  His
6736        100.1          98; 93; 66; 60       GST
6737        101.2          multiple bands       GST
6751        100.2          95; 71               GST
6752        102.1          97; 48               His
6767        29.1           28                   GST
6784        32.9           35                   GST
6790        71.3           multiple bands       His
6802        29.7                                GST
6814        29.6           28                   GST
6830        177.4          174; 91; 13          GST
6849        57.3           multiple bands       GST
6850        7.4-9.4        61; 14; 8            GST
6854        42.2                                GST
6878        40.4                                GST
6900        28.0                                GST
6960        25.6           75; 35               GST
6968        34.6           83; 53; 35           GST
6998        39.3           multiple bands       GST
7033        68.2           multiple bands       GST
7101        113            105                  GST
7102        63.4                                GST
7105                       [illegible]          GST
7106        39.5           72; 46
7107        71.4                                His
7108        35.9           35                   GST
7111        46.1           51                   GST
7132        17.9           57; 47; 17           His
7140        36.2-29.8      50; 38; 34           GST
7170        34.4           77; 33               GST
7224        39.4           40                   GST
7287        167.3          180                  GST
7306        50.1           50                   GST



TABLE IV - FACS-positive proteins not found in C.trachomatis 



cp7105 


cp6390 


cp7106 


cp6784 


cp7107 


cp6296


cp7108 





TABLE V - Proteins identified by MALDI-TOF following 2D electrophoresis 



cp6270 


cp6733 


cp6900 


cp6552 


cp6736 


cp6960 


cp6576 


cp6737 


cp6998 


cp6577 


cp6752 


cp7033 


cp6602 


cp6767 


cp7108 


cp6664 


cp6784 


cp7111 


cp6727 


cp6790 


cp7170 


cp6728 


cp6830 


cp7287 


cp6729 


cp6849 


cp7306 
CLAIMS

1. A protein comprising an amino acid sequence selected from the group consisting of SEQ IDs 97,
1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 49, 51, 53,
55, 57, 59, 61, 63, 65, 67, 69, 71, 73, 75, 77, 79, 81, 83, 85, 87, 89, 91, 93, 95, 99, 101, 103, 105,
107, 109, 111, 113, 115, 117, 119, 121, 123, 125, 127, 129, 131, 133, 135, 137, 139, 141, 143,
145, 147, 149, 151, 153, 155, 157, 159, 161, 163, 165, 167, 169, 171, 173, 175, 177, 179, 181,
183, 185, 187, 189, 191, 193, 195, 197, 199, 201, 203, 205, 207, 209, 211, 213, 215, 217, 219,
221, 223, 225, 227, 229, 231, 233, 235, 237, 239, 241, 243, 245, 247, 249, 251, 253, 255, 257,
259, 261, 263, 265, 267, 269, 271, 273, 275, 277, 279, 281, 283, 285, 287, 289, 291, 293, 295,
297, 299, 301, 303, 305, 307, 309, 311, 313, 315, 317, 319, 321, 323, 325, 327, 329, 331, 333,
335, 337, 339, 341, 343, 345, 347, 349, 351, 353, 355, 357, 359, 361, 363, 365, 367, 369, 371,
373, 375, & 377.

2. A protein having 50% or greater sequence identity to a protein according to claim 1.

3. A protein comprising a fragment of an amino acid sequence selected from the group consisting of
SEQ IDs 97, 1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47,
49, 51, 53, 55, 57, 59, 61, 63, 65, 67, 69, 71, 73, 75, 77, 79, 81, 83, 85, 87, 89, 91, 93, 95, 99,
101, 103, 105, 107, 109, 111, 113, 115, 117, 119, 121, 123, 125, 127, 129, 131, 133, 135, 137,
139, 141, 143, 145, 147, 149, 151, 153, 155, 157, 159, 161, 163, 165, 167, 169, 171, 173, 175,
177, 179, 181, 183, 185, 187, 189, 191, 193, 195, 197, 199, 201, 203, 205, 207, 209, 211, 213,
215, 217, 219, 221, 223, 225, 227, 229, 231, 233, 235, 237, 239, 241, 243, 245, 247, 249, 251,
253, 255, 257, 259, 261, 263, 265, 267, 269, 271, 273, 275, 277, 279, 281, 283, 285, 287, 289,
291, 293, 295, 297, 299, 301, 303, 305, 307, 309, 311, 313, 315, 317, 319, 321, 323, 325, 327,
329, 331, 333, 335, 337, 339, 341, 343, 345, 347, 349, 351, 353, 355, 357, 359, 361, 363, 365,
367, 369, 371, 373, 375, & 377.

4. A nucleic acid molecule which encodes a protein according to any one of claims 1 to 3.

5. A nucleic acid molecule according to claim 4, comprising a nucleotide sequence selected from
the group consisting of SEQ IDs 98, 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34,
36, 38, 40, 42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70, 72, 74, 76, 78, 80, 82, 84, 86,
88, 90, 92, 94, 96, 100, 102, 104, 106, 108, 110, 112, 114, 116, 118, 120, 122, 124, 126, 128,
130, 132, 134, 136, 138, 140, 142, 144, 146, 148, 150, 152, 154, 156, 158, 160, 162, 164, 166,
168, 170, 172, 174, 176, 178, 180, 182, 184, 186, 188, 190, 192, 194, 196, 198, 200, 202, 204,
206, 208, 210, 212, 214, 216, 218, 220, 222, 224, 226, 228, 230, 232, 234, 236, 238, 240, 242,
244, 246, 248, 250, 252, 254, 256, 258, 260, 262, 264, 266, 268, 270, 272, 274, 276, 278, 280,
282, 284, 286, 288, 290, 292, 294, 296, 298, 300, 302, 304, 306, 308, 310, 312, 314, 316, 318,
320, 322, 324, 326, 328, 330, 332, 334, 336, 338, 340, 342, 344, 346, 348, 350, 352, 354, 356,
358, 360, 362, 364, 366, 368, 370, 372, 374, 376, & 378.

6. A nucleic acid molecule comprising a fragment of a nucleotide sequence selected from the group
consisting of SEQ IDs 98, 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40,
42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70, 72, 74, 76, 78, 80, 82, 84, 86, 88, 90, 92,
94, 96, 100, 102, 104, 106, 108, 110, 112, 114, 116, 118, 120, 122, 124, 126, 128, 130, 132, 134,
136, 138, 140, 142, 144, 146, 148, 150, 152, 154, 156, 158, 160, 162, 164, 166, 168, 170, 172,
174, 176, 178, 180, 182, 184, 186, 188, 190, 192, 194, 196, 198, 200, 202, 204, 206, 208, 210,
212, 214, 216, 218, 220, 222, 224, 226, 228, 230, 232, 234, 236, 238, 240, 242, 244, 246, 248,
250, 252, 254, 256, 258, 260, 262, 264, 266, 268, 270, 272, 274, 276, 278, 280, 282, 284, 286,
288, 290, 292, 294, 296, 298, 300, 302, 304, 306, 308, 310, 312, 314, 316, 318, 320, 322, 324,
326, 328, 330, 332, 334, 336, 338, 340, 342, 344, 346, 348, 350, 352, 354, 356, 358, 360, 362,
364, 366, 368, 370, 372, 374, 376, & 378.

7. A nucleic acid molecule comprising a nucleotide sequence complementary to a nucleic acid
molecule according to any one of claims 4 to 6.

8. A nucleic acid molecule comprising a nucleotide sequence having 50% or greater sequence
identity to a nucleic acid molecule according to any one of claims 4 to 7.

9. A nucleic acid molecule which can hybridise to a nucleic acid molecule according to any one of 
claims 4 to 8 under high stringency conditions. 

10. A composition comprising a protein or a nucleic acid molecule according to any preceding claim.

11. A composition according to claim 10 being a vaccine composition.

12. A composition according to claim 10 or claim 11 for use as a pharmaceutical. 

13. The use of a composition according to claim 10 in the manufacture of a medicament for the
treatment or prevention of infection due to Chlamydia bacteria, particularly Chlamydia
pneumoniae.
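
Claims 2 and 8 refer to a threshold of 50% or greater sequence identity. The claims above do not themselves state how identity is computed, so the following is only a minimal sketch under one common convention: identical columns divided by the number of aligned columns, counted over two sequences that have already been aligned (gaps written as "-"). The function name and the variant sequence are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over the columns of a pre-computed pairwise alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Ignore columns in which both sequences carry a gap.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == gap and b == gap)]
    if not columns:
        raise ValueError("alignment has no comparable columns")
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


# First ten residues of cp7193 versus a hypothetical single-substitution variant.
reference = "MKRVIYKTIF"
variant = "MKRVLYKTIF"

identity = percent_identity(reference, variant)
print(identity)          # 9 of 10 columns identical
print(identity >= 50.0)  # comparison against the 50% threshold
```

Other conventions (for example, dividing by the length of the shorter sequence) give different values for gapped alignments, so the denominator should be fixed before a threshold is applied.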